What is the SGP Data Package?
The SGP package provides classes and functions for calculating student growth percentiles and percentile growth projections/trajectories from large-scale, longitudinal education assessment data. It does this by using quantile regression to estimate the conditional density associated with each student’s achievement history. The coefficient matrices produced by those regressions are then used to compute percentile growth projections, which indicate how much growth a student needs to reach a desired level of performance.
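To make the idea of a growth percentile concrete, here is a minimal Python sketch. It is not the SGP package's method: instead of quantile regression over the full achievement history, it swaps in a much simpler technique, ranking each student's current score among peers whose single prior score falls in the same band. The function name `growth_percentiles`, the `band_width` parameter, and the sample records are all hypothetical, and clamping the result to 1 through 99 is an assumption of this sketch.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def growth_percentiles(records, band_width=10):
    """Return {student_id: percentile} using prior-score bands as peer groups.

    records: list of (student_id, prior_score, current_score) tuples.
    Simplified stand-in for quantile regression, for illustration only.
    """
    # Group current scores by prior-score band. A real analysis would
    # condition on the whole achievement history, not one banded score.
    peers = defaultdict(list)
    for _, prior, current in records:
        peers[prior // band_width].append(current)
    for scores in peers.values():
        scores.sort()

    result = {}
    for student_id, prior, current in records:
        scores = peers[prior // band_width]
        below = bisect_left(scores, current)
        ties = bisect_right(scores, current) - below
        # Mid-rank percentile: tied scores count as half below.
        pct = 100.0 * (below + 0.5 * ties) / len(scores)
        # Clamp to the 1..99 range (an assumption of this sketch).
        result[student_id] = min(99, max(1, round(pct)))
    return result


# Hypothetical sample data: five students with similar prior scores.
records = [
    ("a", 200, 210),
    ("b", 204, 230),
    ("c", 207, 250),
    ("d", 209, 220),
    ("f", 201, 240),
]
print(growth_percentiles(records))
```

Students who started from similar prior scores are compared only with each other, so a high percentile reflects strong growth relative to academic peers rather than a high absolute score.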
The term ‘big data’ describes datasets that are too large for traditional analytical applications. The SGP aims to assemble an unprecedented amount of information for the scientific questions at hand, but in comparison to, say, an analysis of global Facebook interactions, it is small potatoes.
SGP members are working to assemble or generate multi-proxy sedimentary geochemical data (iron, carbon, sulfur, and major and trace metal abundances) for the Neoproterozoic through Paleozoic eras. This is a significant endeavor that involves both collecting and integrating existing data and generating new data. Unlike full community databases such as GenBank and EarthChem, the data compiled by research consortia such as SGP are focused on specific research questions rather than broad scientific topics.
To run SGP analyses, you will need a computer running R, an open source software environment. R is available for Windows, macOS, and Linux. It is recommended that you spend some time familiarizing yourself with R before diving into SGP analyses.
The SGP website contains detailed documentation on how to install and run the package. It also includes a variety of examples that demonstrate how to use the package to perform various SGP analyses. Further R resources can be found on the CRAN website.
SGP is a collaborative project and relies on the contributions of many individuals. To contribute to the project please see the Contributing Guidelines. If you have any questions about contributing to the SGP, please contact us via email.
SGP aims to make its data freely available through the ARM Data Discovery system. Data Discovery contains a searchable database of all instruments at the SGP site and allows scientists to download both archival and near-real-time data. ARM transmits all observations gathered at the site to the ARM Data Center, where they are made available to the scientific community.
In addition to the archival and near-real-time observation data, SGP is building data sets that can be used for modeling and for assimilation into earth system models. The LES ARM Symbiotic Simulation and Observation (LASSO) activity has developed a framework that pairs modeling capabilities with instrument data to give a more complete representation of the atmospheric environment surrounding the SGP. This allows researchers to carry out more holistic and self-consistent investigations that build on the site’s unique observing capabilities. The LASSO data are also transmitted to the ARM Data Center and made freely available through Data Discovery. The SGP supports a wide range of research activities that use the data, including single-observation analyses, multi-observation process studies, and assimilation into earth system models.