Girish Mahajan (Editor)

DataONE


Data Observation Network for Earth (DataONE) is a platform for innovative, collaborative environmental and ecological science, using sustainable cyberinfrastructure and a distributed framework to provide open, robust, persistent and secure access to Earth observational data. Supported by funding from the National Science Foundation as one of the initial DataNet programs in 2009, DataONE works to ensure the preservation, access, use, and reuse of multi-discipline scientific data through the construction of primary cyberinfrastructure elements and the expansion of a broad and relevant education and outreach program. Currently in its second phase of funding, DataONE provides scientific data archiving for ecological and environmental data produced by scientists worldwide. DataONE's stated goal is to preserve and provide access to multi-scale, multi-discipline, and multi-national data. The global community of users for DataONE includes scientists, ecosystem managers, policy makers, students, educators, librarians, and the public.

DataONE links together existing cyberinfrastructure to provide a distributed framework, sound management, and robust technologies that enable long-term preservation of diverse multi-scale, multi-discipline, and multi-national observational data. The distributed framework is composed of three Coordinating Nodes, currently located at the Oak Ridge Campus in Tennessee, the University of California, Santa Barbara, and the University of New Mexico, and many Member Nodes located around the world. DataONE also provides resources such as an Investigator Tool Kit, which gives the DataONE user community tools for accessing and using DataONE efficiently.

Coordinating nodes

The three Coordinating Nodes provide network-wide services to Member Nodes. They are geographically replicated, with mirrored content and full copies of science metadata. The three Coordinating Nodes are:

  • University of New Mexico
  • Oak Ridge Campus (partnership of Oak Ridge National Laboratory (ORNL) and University of Tennessee)
  • University of California, Santa Barbara (UCSB)

Member nodes

The Member Nodes consist of Earth observing institutions, projects, and networks. They provide resources for their own data and replicated data, and focus on serving their specific constituencies. These Member Nodes are geographically distributed and consist of diverse implementations. Current Member Nodes include:

  • Cornell Lab of Ornithology eBird
  • Dryad
  • Earth Data Analysis Center (EDAC)
  • Environmental Data for the Oak Ridge Area (EDORA)
  • Ecological Society of America (ESA) Data Registry
  • Europe Long-Term Ecosystem Research Network (LTER Europe)
  • Global Lake Ecological Observatory Network (GLEON)
  • Gulf of Alaska Data Portal
  • International Arctic Research Center (IARC) Data Archive
  • Knowledge Network for Biocomplexity
  • Long Term Ecological Research Network (LTER)
  • Merritt Repository
  • Minnesota Population Center (MPC)
  • Montana IoE Data Repository
  • Nevada Research Data Center
  • New Mexico Experimental Program to Stimulate Competitive Research (NM EPSCoR)
  • NOAA National Centers for Environmental Information (NCEI) Oceanographic Data Archive
  • ONEShare Repository
  • ORNL Distributed Active Archive Center
  • Partnership for Interdisciplinary Studies of Coastal Oceans (PISCO)
  • Program for Research on Biodiversity (PPBio) [1]
  • Regional and Global Biogeochemical Dynamics Data (RGD)
  • SANParks Data Repository
  • SEAD Virtual Archive
  • Taiwan Forestry Research Institute
  • Terrestrial Ecosystem Research Network (TERN)
  • University of Kansas - Biodiversity Institute
  • USA National Phenology Network
  • USGS Science Data Catalog (SDC)

Investigator Tool Kit

The Tool Kit provides tools for researchers to access DataONE. These are both general purpose and discipline-specific tools, and DataONE developers adapt existing tools where possible. The Tool Kit includes Java and Python libraries, an R programming language plug-in for analysis, extensions for Excel, the VisTrails scientific workflow system, and the Kepler scientific workflow system.
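As a rough illustration of what programmatic access can look like, the sketch below builds a search URL for a Coordinating Node using only the Python standard library. It does not use the Tool Kit's own Python library. The base URL, the Solr-style query parameters, and the field names (`text`, `identifier`, `title`) are assumptions made for this example and should be checked against the current DataONE API documentation. The helper name `build_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed address of a Coordinating Node query service (not taken from
# this article); verify against the DataONE API documentation.
CN_QUERY_URL = "https://cn.dataone.org/cn/v2/query/solr/"


def build_search_url(keyword, rows=10, base_url=CN_QUERY_URL):
    """Build a URL that searches science metadata for a keyword.

    The parameter and field names below are assumptions for
    illustration, following common Solr conventions.
    """
    # Escape embedded double quotes so the phrase query stays well formed.
    phrase = keyword.replace('"', '\\"')
    params = {
        "q": f'text:"{phrase}"',    # full-text phrase search (assumed field)
        "fl": "identifier,title",   # fields to return (assumed names)
        "rows": rows,               # maximum number of results
        "wt": "json",               # response format
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Only prints the URL; no network request is made here.
    print(build_search_url("soil moisture", rows=5))
```

The function only constructs the URL, so it can be inspected or tested offline before passing it to an HTTP client.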

Data management

DataONE provides a place for scientists to store data and its associated metadata. The metadata makes this data searchable and accessible to other scientists. Data management practices include:

  • Data management planning
  • Data acquisition (techniques, protocols, methods)
  • Data protection (backing up)
  • Data entry and manipulation (naming files, organization), for example with MATLAB or R
  • Quality control on data
  • Data analysis
  • Workflow tools (VisTrails, Kepler scientific workflow system)
  • Data documentation (metadata)
  • Data sharing, citation, and discovery
  • Data preservation & curation

Additional data management planning resources include a primer on best practices, a database of best practices in data management, educational modules and tutorials, webinars, and an investigator toolkit. Many of these resources have been used or adapted under a Creative Commons license by organizations and institutions that seek to educate other communities about data and research management. Understanding the different audiences of users led to the development of user personas [2] as models for users such as early-career researchers, science data librarians, citizen scientists, K-12 educators, and others.

Collaborations

DataONE collaborates with other institutions to bring together tools that support good, sustainable data management practices. One of those tools, developed in collaboration with other organizations and hosted by the University of California Digital Curation Center, is the DMPTool for data management planning. [3] The DMPTool is used and referenced in research data management plans by many institutions in the US and around the world. Another recent collaboration in this area is the shared construction of a Data Management Training Clearinghouse for Earth sciences, in partnership with USGS and the Community for Data Integration (CDI). [4]

DataONE community

The DataONE community includes research networks, professional societies, libraries, academic institutions, data centers, data repositories, environmental observatory networks, educators, scientists, policy makers, administrators, citizen scientists, international organizations, NGOs, ecosystem managers, students, private companies, and the public.

DataONE has an active worldwide users group, the DataONE Users Group (DUG), that represents a wide range of stakeholders. The DUG meets annually in the summer and gives DataONE user feedback that guides areas of interest for future work and helps DataONE reach its stated goals.

References

DataONE Wikipedia