Presented by Glenn Lockwood (NERSC)

The National Energy Research Scientific Computing Center (NERSC) has operated since 1974 and, as a result, has stored and preserved user data continuously for over 45 years. In that time, NERSC has built significant expertise in storing and managing user data for long periods of time (a decade or more) and in the practical factors that must be considered when data must be retained for longer than the lifetime of the physical components of the data center, including the data center facility itself. As the relevance of HPC extends beyond modeling and simulation and the usable lifetime of data extends from months to years or decades, these best practices in long-term data stewardship are likely to become more important to more HPC facilities. To this end, we present some of the practical considerations, best practices, and lessons learned from managing the scientific data of NERSC's thousands of users over a period of four decades.
Managing Decades of Scientific Data in Practice at NERSC