Title: A guide to the NCVS database
1A guide to the NCVS database
- Plus some nifty stuff about Access
- Michael Lee
- 11/29/2001
2What is a database?
- A database is an organized structure for
- Storing
- Manipulating
- Entering
- and Reporting data
- Each of these is performed by
- Tables
- Queries
- Forms
- Reports
..........................
................ ........................
......
A database exists as a single file on a computer.
In MS Access, the files are named with extension
.mdb
3The Database Window
- Shows all the objects in the database.
- Select different types of objects.
- Has large icon, small icon, list, and details
views of objects (like Windows). - Hit F11 to get there.
4Tables
- Many different tables can be in each database
open by double clicking on icon.
- Here, the Categories table contains column
headers (Field Names) that tell you about the
information below.
- Click the straight edge to go to design view
and find out more about the table.
5Tables Design View
- Design view shows the field names and their
properties which restrict what kind of data and
how much can be entered into a field.
- Data types can be text, number, memo (very long
text), hyperlinks, etc. - Field sizes limit how long the data can be.
- Descriptions tell you more about what the field
actually is (very helpful later in the NCVS
database).
6Relationships connecting tables
- You can see the relationships by clicking the
relationships icon, or by menu ToolsRelationships
.
- Relationships connect 2 tables Categories and
products through one field CategoryID in each
table. - The two fields dont need to have the same name,
but must be of the same data type. - Often, one field is a primary key (bold), which
has unique values (that is, only one occurrence
of each value in the table). - Relationships are most useful if they are 1 to 1
or 1 to many. Here, all are 1 to many.
7Queries
- The simplest query is a SELECT query you select
which table(s) and fields you would like
displayed.
- You can also limit which records (rows) are
displayed by specifying criteria.
8Queries - calculations
- You can add fields which will calculate values
based on other fields, here from two tables,
linked by a relationship. - Name a field with the field name and a colon,
then an expression.
- Click the Expression Builder icon to better see
or write a fields expression. - Built-in functions are quite useful when creating
expressions.
9Queries - totals
- Click the Sigma icon to see Total line below.
- Group by fields will show unique values in
rows. - Other fields will have calculations performed on
all records that match the group by fields from
the original table.
10Queries other types
- Crosstab queries
- like Pivot Tables in Excel.
- Values become field names.
- Fields are calculated based on row and column
headings.
- Append queries
- Add records to a table from another table.
- Union queries
- Similar to stacking datasets in SAS.
- Stacks multiple tables so that all records in
all tables are present in the query.
- Make-table queries
- Create a table that stores the data currently
queried. - Useful with complex queries than run slowly.
- The NCVS database contains all these types of
queries. - There are other types of queries that may become
useful to you see Access Help.
- Update queries
- Update certain fields of a table to new values
based on an expression
11Forms
- Based on one or multiple tables or queries.
- Simpler view of complex data.
- Used for data entry.
- Fields can be locked so that editing them is
prevented. - Forms can have buttons that can sort, print,
edit, etc.
12Reports
- Based on one or multiple tables or queries.
- Used to view data, often summarized by groups,
here by category.
13Thats the basics of Access
- MS Access provides a good help menu that will
guide you through how to use Access. - The sample database is also helpful in figuring
out how things work. It is located here - \\Uniola\C\Program Files\Microsoft
Office\Office\Samples\Northwind.mdb
14NCVS Database
15NCVS Database
- database folder location
- CData\NCVS\Copy_NCVS_Database\In-Out\
- The database is named according to the copy date
in this In-Out folder, - like Nov29_2001_NCVSProto4.mdb
- If you wish to use the database, copy it from the
In-Out folder to your own computer space (i.e.
\users\YourName\ or your own computer) - This allows all of us to be working with a fresh
copy
16NCVS Database stats
- The NCVS Database currently contains
- 4,648 plots
- 321 USGS Quadrangles in 5 states
- 40 different projects
- 3,135 different taxa
- 171,933 woody stems
- 423 people participating (still incomplete)
17NCVS Database basic structure
- There are about 23 tables that contain
information directly relating to the plots. - Another 13 tables contain support information
that help interpret the tables that relate
directly to the plots. i.e. The Counties table
contain names of counties, from which File1 can
select to assign a plot to a county.
18NCVS Relationships
This report (rel_ver5 ) is in the NCVS database.
19Overview of NCVS tables
- Master Lists
- All_Plots
- Project
- Vegetative Data
- HerbData
- TreeDataSml
- TreeDataBig
- Vegetative Attributes
- File3_VegAttr
- StratumPlot
- Environment
- File2_Site_Attributes
- File4_McNab_Indices
- GroundCoverPlot
- DisturbancePlot
- Soil Data
- File5_SoilDepth
- Soil_Nutr
- Soil_Text
- Plot Location and Layout
- File1_Plot_Summary
- PlotPlace
- Classification
- ClassEvent
- ClassAssign
- ClassContributor
- Contribution by People
- PlotContributor
- ProjectContributor
- Documentation
- FieldDefn
- Notes
20Group Name
Format of Following Slides
- Table Name (in italics)
- Primary key FieldName
- Relates to table(s) (by FieldName or
pkprimary key) - If (by FieldName) is absent, then the tables
link by project_team_plot (a.k.a. plotID) - General information about table and useful tips
about table
Graphic of group with group name, tables, and
fields.
21Master Lists (p1)
- All_Plots
- Primary key Project_Team_Plot
- Relates to Project (by Project) and most
tables that have direct data on plots - One record per plot
- No plot data can exist in a table without a
corresponding plot in All_Plots - Contains Project, Team, and Plot in
separate fields
22Master Lists (p2)
- Project
- Primary key Project_ID (project number)
- Relates to All_Plots, Notes, Field_Defn,
Project Contributor (ALL by pk) - One record per project
- Contains useful descriptions about project
- Specifically ProjRegion, which divides projects
into CP, LL, M, PD, OT
23Documentation (p1)
- Field_Defn
- Primary key fieldDefnID
- Relates to Project (by Project)
- One record per field definition
- FieldDesc defines field values
- Often interprets codes
- Gives ranges of values
- If project is blank, then definition refers to
all projects, otherwise, only the project
mentioned in project
24Documentation (p2)
- Notes
- Primary key NoteID
- Relates to All_Plots, Project (by
Project_entire) - One record per note (multiple notes per project
and/or plot) - Contains notes about plots and/or projects.
- NoteTypes
- Primary key NoteTypeID
- Relates to Notes (by pk)
- Contains valid types of notes
25Vegetative Data (p1)
- HerbData
- Primary key HerbDataID
- Relates to CarSpList (by SppID), All_Plots
- One record per species per module per plot
- Summary module S shows overall composition of
each plot - Spatial scale information for up to 5 corners
c1-c5 - Cover classes cov
- Cover classes for up to 7 strata ns1-ns7
- Cover classes are according to NCVS scale (see
Field_Defn)
26Vegetative Data (p2)
- TreeDataSml
- Primary key TreeLineID
- Relates to CarSpList (by SppID), All_Plots,
TreeDataBig (by pk) - One record per species per module per plot
- Summary module S shows overall composition of
each plot - 10 fields for number of stems in each size class
(standard NCVS classes) d0,d1,...d35 - overall subsampling percent for each species in
each module - NewSubS (for saplings d0,d1)
- NewSubT (for trees all others)
27Vegetative Data (p3)
- TreeDataBig
- Primary key TreeDataBigID
- Relates to TreeDataSml (by TreeLineID)
- One record per stem
- Module, plotID, SppID and other relavent
info (like NewSubT) are found in linked
TreeDataSml - bigtree is the dbh of the large stem (gt40cm)
28Species Datatables (p1)
- CarSpList
- Primary key SppID
- Relates to HerbData (by pk), TreeDataSml (by
pk), LatestVersionCarolSpDB (by pk) - One record per taxon
- Contains taxonomic data to interpret SppID in
VegData tables - Updated via an Action Query to include any
changes in LatestVersionCarolSpDB
29Species Datatables (p2)
- LatestVersionCarolSpDB
- Primary key SppID
- Relates to CarSpList (by pk)
- One record per taxon
- A linked table (window into this database) from
the Carolina Species Database
30Vegetative Attributes (p1)
- File3_VegAttr
- Primary key Project-Team-Plot
- Relates to All_Plots, StratumPlot
- One record per plot
- Contains data about what the vegetation of the
entire plot is like - Physiognomic Class is overall type of vegetation,
i.e. Forest, Savanna, Shrubland, etc. - Field_Defn has more field info
- Some fields have been replaced by a new table
StratumPlot, but we arent sure if the new
format will stick, so they are still (mostly)
preserved (i.e. EMaxHt)
31Vegetative Attributes (p2)
- StratumPlot
- Primary key StratumPlotID
- Relates to File3_VegAttr, StratumType (by
StratumTypeID) - One record per stratum per plot
- Contains definitions of the vertical strata of a
plot that may be referred to in HerbData.ns1,
.ns2,... - StratumType
- Primary key StratumTypeID
- Relates to StratumPlot (by pk)
- One record per stratum type
- Provides a list of standard strata that may used
to describe a plot
32Environment (p1)
- File2_Site_Attributes
- Primary key Project-Team-Plot
- Relates to GroundCoverPlot, All_Plots,
DisturbancePlot - One record per plot
- Contains information about many environmental
variables, such as - slope, aspect, elevation
- soil description and types (soil series and other
variables) - hydrologic variables
- Fields relating to Ground Cover and Disturbance
now are in separate tables - Many fields are blank for many plots
33Environment (p2)
- GroundCoverPlot
- Primary key GroundCoverPlotID
- Relates to File2_Site_Attributes,
GroundCoverType (by GCTypeID) - One record per Ground Cover Type per plot
- Contains percent cover for each Ground Cover Type
for each plot - GroundCoverType
- Primary key GroundCoverTypeID
- Relates to GroundCoverPlot (by pk)
- One record per Ground Cover type
- Provides a list of standard Ground Cover Types
that may used to describe a plot (Bedrock,
Litter, Water, etc.)
34Environment (p3)
- DisturbancePlot
- Primary key DisturbancePlotID
- Relates to File2_Site_Attributes,
DisturbanceType (by DisturbanceTypeID) - One record per Disturbance Type per plot
- Contains severity and description of Disturbance
for each type on a plot - DisturbanceType
- Primary key DisturbanceTypeID
- Relates to DisturbancePlot (by pk)
- One record per Disturbance Type
- Provides a list of standard Disturbance Types
that may used to describe a plot (Human, Natural,
Fire, Animal)
35Environment (p4)
- File4_McNab_Indices
- Primary key Project-Team-Plot
- Relates to All_Plots
- One record per plot (Mtn plots only)
- McNab Indices measure the bowl-shaped-ness or
ridge-shaped-ness - Contains LFI and TSI inclinations (in degrees) at
8 angles or the calculated LFI and TSI (if
individual angles are not available) - LFI is LandForm Index
- (angle to horizon)
- TSI is Terrain Shape Index
- (angle formed by local slope shape, 10m scale)
36Soil Data (p1)
- File5_SoilDepth
- Primary key DepthID
- Relates to All_Plots
- One record per corner per module per plot (16
records for a standard plot) - Depth is to impermeable layer, in cm
- Some module or corner values are text to
indicate max, min, or avg (where raw data
unavailable)
37Soil Data (p2)
- Soil_Nutr
- Primary key Project_Team_Plot
- Relates to All_Plots
- One record per module per horizon per plot
- Contains results of nutrient analysis of soil
samples - ModuleC means that the values for that record
are from a composite of the different modules
soil (or values)
38Soil Data (p3)
- Soil_Text
- Primary key Project_Team_Plot
- Relates to All_Plots
- One record per module per horizon per plot
- Contains results of texture analysis of soil
samples - ModuleC means that the values for that record
are from a composite of the different modules
soil (or values)
39Plot Method and Location (p1)
- File1_Plot_Summary
- Primary key Project-Team-Plot
- Relates to All_plots, PlotPlace, States (by
State Abrv), Counties (by County ID),
MapQuadrangles (by Quadrangle ID) - One record per plot
- Contains Location Information
- UTM Easting, Northing, and Zone
- Latitude and Longitude
- Estimated Error in Coordinates
- County, State, Quadrangle (foreign keys)
- Methodology
- Plot size (herb and tree), Date, Photo Data
- CoverMethod is method of herb sampling
40Plot Method and Location (p2)
- MapQuadrangles
- Primary key QuadrangleID
- Relates to File1_Plot_Summary, (by pk)
- One record per Quadrangle
- Contains Quadrangle Information
- Quadrangle Name and State(s)
- Quadrangle Base Coordinates
- QuadrangleID (number) is stored in
File1_Plot_Summary, not Quadrangle Name - Quadrangle Name appears in File1 because of
settings on Lookup table - Quadrangle name can be queried from
MapQuadrangles table
41Plot Method and Location (p3)
- Counties
- Primary key County ID
- Relates to File1_Plot_Summary, (by pk)
- One record per county per state
- Contains County Name and State
- As with MapQuadrangles, County ID (number) is
stored in File1_Plot_Summary - State
- Primary key Abbrev
- Relates to File1_Plot_Summary, (by pk)
- One record per state
- Contains State Abbreviation and State
- As with MapQuadrangles, Abbrev is stored in
File1_Plot_Summary
42Plot Method and Location (p4)
- PlotPlace
- Primary key PlotPlaceID
- Relates to File1_Plot_Summary, PlaceNames (by
NamedPlace) - One record per Place Name per plot
- Assigns a plot to one or more named Places
- PlaceID is stored from PlaceNames
- PlaceNames
- Primary key PlaceID
- Relates to PlotPlace, (by pk)
- Contains valid Place names from which PlotPlace
can select
43Party
- Party
- Primary key PartyID
- Relates to PlotContributor (by pk),
ClassContributor (by pk), ProjectContributor
(by pk) - One record per person
- Contains names and contact information for people
who have contributed to the NCVS dataset in some
manner (see Roles for different contribution
types)
44Contributor (p1)
- PlotContributor
- Primary key PlotContributorID
- Relates to All_Plots, Party (by PartyID),
Roles (by RoleID) - One record per person (per role) per plot
- Credits a person (Party) with contributing to a
plot in a particular role (Roles) - Roles
- Primary key RoleID
- Relates to PlotContributor (by pk),
ProjectContributor (by pk) - Contains the valid possible roles to contribute
either to plot or project
45Contributor (p2)
- ProjectContributor
- Primary key ProjContribID
- Relates to Project (by ProjectNumber, Party
(by PartyID), Roles (by RoleID) - One record per person (per role) per project
- Very similar to PlotContributor, but for
projects instead of plots - Credits a person (Party) with contributing to a
project in a particular role (Roles) - Mainly for recording status of projects in data
entry
46Classification (p1)
- ClassEvent
- Primary key ClassEventID
- Relates to All_Plots (by PlotObsID
PlotID), ClassAssign (by pk),
ClassContributor (by pk) - One record per Classification Event per plot
- A Classification Event is an effort by one or
more people to classify a plot - Contains
- Method of classification
- Notes on overall classification event
- Date of classification event
47Classification (p2)
- ClassAssign
- Primary key ClassAssignID
- Relates to ClassEvent (by ClassEventID),
ClassCodes (by ClassCode) - One record per Classification Assignment per plot
- A Classification Assignment is a plot assigned to
a particular CEGL code, Alliance, or Association - Contains
- Fit- how closely plot matches typal
classification community - Confidence- how sure the classification and fit
are - Notes on the particular assignment
48Classification (p3)
- ClassCodes
- Primary key CEGL_All_Assn_Code (that is CEGL
code, Alliance code, or Association code) - Relates to ClassAssign (by pk)
- One record per classification type, (Community,
Alliance, or Association) - Contains
- Formation 5 strings (. delimited) that show the
lower resolution groups of the particular
classification type (IV.A.2.N.a) - Common names of classification types
- Other miscellaneous support info
49Classification (p4)
- ClassContributor
- Primary key ClassContribID
- Relates to ClassEvent (by ClassEventID,
Party (by PartyID) - One record per classification event per person
contributing - Very similar to PlotContributor
- Contains party members who contributed to the
classification event
50Tables more information
- To find out more about a particular table, click
the details view on the database window - There you can see the description field for each
table, which describes each important table - To find out more about each field in a table,
click on design view and read the description
field there - Description of a field also appears in lower left
hand corner of the window when the cursor is in a
field of a table in datasheet view
51NCVS Database
52Queries (extant) (p1)
- HerbRichness_all_scales
- a make-table query that creates the table
HerbData_scalar_richness - These show the richness at the available spatial
scales for each module - HerbData_withNC_Code and TreeDataSml_withNC_Cod
e - Show most up to date NC_Codes with Herb or Tree
Data - Scientific Name or other taxonomic data can
easily be added from CarSpList
53Queries (extant) (p2)
- ProjectStats show interesting information about
how many plots and their combined size for each
project - W_ is the prefix for queries that are used by
other queries deleting these may cause other
queries, forms, or tables to cease functioning - Z_ is the prefix for queries that tell us about
the overall status of the database - Percent of each field complete for each project
- Range of values
54Reports
- As with Queries, Z_ means the report shows us
the status of the database - Z_Done gives percentages of plots with values
in each field - Z_Table gives the ranges of values for each
field
55Reports Z_Done
56Reports Z_File2
- This type of report shows either
- the range of values (max and min) if numeric
- All values used, and their frequency
57Forms
- Phase4Entering is used for data entry of the
cover sheets - two page form
- Has subforms which display multiple records per
plot (i.e. Ground Cover, Soil Depths, etc.) - You can view almost all relevant data about a
plot (except VegData) through this Form
58Making Your Own Queries
- Most questions require a bit of manipulation to
the data, so you will probably need to know how
to make your own queries
59Tips on creating queries (p1)
- Know the data
- Know what the fields mean and how complete the
plots are for that field - Do one thing at a time
- Add one table at a time and make sure youre
creating the results you want - Use relationships already in existence before
adding new relationships in query
60Tips on creating queries (p2)
- Make sure you arent losing plots as you add
tables to queries check the joins (inner and
outer) - Make queries out of other queries if things start
to seem too complex - If your queries run too slowly, consider changing
a query to a make-table query and base the next
query on the table, which is then stored on disk
61Creating a query
- Question What is the correlation of slope and
Project Region? - Step 1 Know where the applicable data are
- Project contains project regions and
File2_Site_Attributes contains slope
62Creating a query
- Step 2 Create query
- Double Click Create query in Design view
- OR use wizard
- Well use Design view, since thats more helpful
in editing queries once they are designed
63Creating a query
- Step 3 Show tables
- Start with the table that has more (or more
complex) data - Here, File2_Site_Attributes
64Creating a query
- Step 4 Add fields
- Slope
- Double click Slope or click, hold, and drag
to Field cell - Click Datasheet View to see what you have
65Creating a query
- Step 5 How many plots and how much data do you
have? - At bottom of window, you see that 4191 records
exist - Since there is one plot per record in this table,
we know we have 4191 plots - There are missing values
- You can check how many values are completed in
Z_Done reports
66Creating a query
- Step 6 How much data is there?
- For Slope, many projects have near 100
presence, some have less (50) - Then decide if this is acceptable and proceed if
so
67Creating a query
- Step 7 Back in Design View, add next table
- Click Show Table icon
- Show All_plots which relates File2 and
project - Go back to Datasheet view and notice that we have
4648 plots more than before because we have a
Outer Join from All_Plots to File2
68Creating a query
- Step 8 Change join, if you need to
- This doesnt change the permanent relationship of
tables, but does change the way the tables relate
inside this query - Select whichever JOIN option youd like
- Outer joins where you include all records from
one table can cause ambiguous queries you may
have to create multiple queries - Inner Join is fine here
69Creating a query
- Step 9 Add next table and field
- project.ProjRegion
- View results
- Here, we have Slope, PlotID, Project, and
ProjRegion - We also have 4191 plots, which is how many plots
have slope records (some blank)
70Creating a query
- Step 10 Show Totals in Query to calculate
average slope across Project Regions - Click Show Totals icon and Total Line appears
below - We want to group by ProjRegion and average
Slope - We would like to ignore Project and PlotID
- Delete the fields from the query or select
where which matches criteria (none specified
here)
71Creating a query
- Step 11 View Results
- Click datasheet view
- We see Project Regions with their average slope
- Maybe we would like to exclude blank project
regions or sort data - Back to design view
72Creating a query
- Step 12 Final Changes
- Sort (ascending or descending) by fields, from
left to right (if multiple sort fields are
specified) - Change criteria
- Change field names
- See results
73Creating a query
- Step 13 Done
- Here are the results, Mountain Regions are
steeper than others in the dataset (whew!) - Now, you could do statistical analysis on this,
or unselect the Show Totals line and analyze
the entire set - Save your query if you want to keep it FileSave
- Maybe name it with your initials and underscore
then the name (to keep separate from other
queries) i.e. ML_AverageRegionalSlope
74Useful expressions for Queries
- Use Is Null or Is Not Null to indicate that you
want Null or non-null records to display (in
criteria) - Criteria are AND in the same row and OR in
different rows - Iif(expr,exprIfTrue,exprIfFalse)
- If expr is true, then the second expression is
used, otherwise the third - If you set criteria, null values never are
included, unless you criteria includes Is Null - Use brackets for field Names, quotations for
strings - combines fields
- IsNull(FieldName) returns true if the field is
null, false otherwise - Example of last several
- Iif(nameltgtJoe or IsNull(name), Not Joe,
but isname, Yes, its Joe)
75Final Access tips
- Access saves records whenever you move to the
next record you cant undo changes youve made
(sometimes 1 undo is permitted) - Access 2002 may be different
- You can export to Excel format (Tables, Queries)
- If you want to export from a query, it must be
saved before you can do so - SAS can also read in directly from the Access
Database for higher power manipulation and
analysis