Data Services Standing Committee Meeting Report

February 2020

Open Action Items from the Fall 2018 DSSC Meeting 

F2018:2 Develop a community letter supporting archiving historical data at the National Archives 

Responsible: Dave Wilson 

Status: (March 2019) Wilson spoke with the USGS Records Disposition Coordinator regarding a community letter in support of NARA accepting the WWSSN film chips. He did not think a letter was necessary at that time, but he may be willing to consider one if he hits a wall. (Oct 2019) The effort hit a brick wall. Participants at the historic film chip workshop worried that NARA would not release material from its archive. The film chips are available at ASL in air-conditioned containers. There was some concern about being able to sustain the cost of climate control, but ASL has found a way to contain costs based on where it is housing the film chips. (March 2020) SSA is considering the issue through a SIG.

 

F2018:5 Develop video tutorials for MUSTANG clients and other tools related to data access. Perhaps consider engaging a UW student intern.

Responsible: Carter 

Status: (March 2019) EPO was approached to see whether interns are a potential resource for this activity. Initial discussions have taken place with both Amanda Thomas at the University of Oregon and Paul Bodin at the University of Washington. No money was found in the FY20 budget, so this should be left as something to do if funds can be found.

 

F2018:7 Prioritize any potential NSF SAGE II Yr1 carryover from IS and DS for Matlab interfaces to MUSTANG metrics: Take to CoCom to coordinate with IS

Responsible: Carter 

Status: (March 2019) No money was found in the budget, so this should be left as something to do if funds can be found. (Oct 2019) During the 2019 QAAC meeting, Jerry Carter asked how much the community wanted such an interface, as (at that time) there was a possibility that funding might be found. A survey was sent out on 08/08/19 by Vedran Lekic, and 11 of the 14 respondents indicated that it would be useful. The budget could not support this project in 2019. Funds may be available in the Y2 budget. (March 2020) No funds were available in the Y2 budget.

  

Open Action Items from the Spring 2020 DSSC Meeting 

[S2020:1] Investigate inclusion of larger models in EMC in coordination with UNAVCO and report to DSSC.   

Responsible: Trabant 

 

[S2020:2] Solicit LLNL and Colorado School of Mines DAS data sets for possible posting on DAS RCN website (to use for community access and feedback). 

Responsible: Rodd/Bozdag/Woodward 

 

[S2020:3] Gather and report on long-term DAS data storage and metadata requirements. 

Responsible: Ajo-Franklin 

 

[S2020:4] Revise Data Acceptance Policy for Data from Permanent Seismological Networks according to DSSC discussion, circulate to DSSC for approval and submit for Board approval.

Responsible: Carter 

 

[S2020:5] Revise Release of Restricted Data policy according to DSSC discussion, circulate to DSSC for approval and submit for CoCom acceptance.

Responsible: Carter 

 

[S2020:6] Summarize discussion on DS priorities in a possible future combined SAGE/GAGE facility and circulate to DSSC for comment.

Responsible: Van der Lee 

  

Open Action Items from the Fall 2019 Joint IRIS & UNAVCO Data Services Committee Meeting 

[JF2019:1] Invite each other’s DS directors and committee chairs to all IRIS and UNAVCO data services committee meetings.

Responsible: Carter/Meertens

 

[JF2019:2] Establish a Cloud Solutions working group when appropriate for informing the Joint Data Services Committee about a joint cloud infrastructure.

Responsible: Directors and Committee Chairs

 

[JF2019:3] Capture the essence of what came out of this first joint meeting to provide a statement that can serve as a stepping stone to generating a Joint Committee Charge.

Responsible: Stump/Elliot

 

[JF2019:4] Develop a survey to be sent to the SAGE/GAGE community to solicit their views on the data services that they believe a joint facility should provide and the priorities of those services. Because many people in the community will not be aware of the need to consolidate, a statement explaining it should be included at the beginning of the survey.

Responsible: Stump/Elliot

 

[JF2019:5] After informing the Boards, write a proposal to NSF to fund a joint Spring workshop that would solicit detailed community input on consolidated data services. Early career scientists as well as other segments of the scientific and educational communities should be represented.

Responsible: Elliot/Stump

 

Brief Meeting Summary 

The committee heard from the liaison from the Board of Directors (BoD), who reiterated topics that the President and Board Chair have already communicated to the community. The liaison also expressed the BoD's appreciation for the proactive attitude of the joint Data Services Committee toward governance of a joint SAGE/GAGE data services facility.

Reports were given by the Director and the Deputies about the status of Data Services. There was a large increase in the volume of data from experimental deployments (nodal data), and this trend is expected to continue. Inquiries about ingesting Distributed Acoustic Sensing (DAS) data have been received, and DS is proactively working with the community on the issues raised by such large datasets. Full SEED was retired in January 2020; a complaint about the change prompted an explanation that was sent to the community through the IRIS mailing list. Other changes that FDSN is considering for the distant future include a new miniSEED format that uses a Uniform Resource Name (URN) in place of the existing (and limiting) Station, Network, Channel, Location (SNCL) description. The StationXML format is fully operational at the DMC as the new metadata standard for seismological data (although dataless SEED is still supported). Data licensing was raised as a need, and DS is pursuing licenses from network data providers. DS recommends the CC-BY license as it allows the data to be open and unrestricted to all. The Director also reported that DS is pursuing CoreTrustSeal certification to replace the Trusted Repository certification that has expired. MUSTANG is now capable of computing metrics on data in the PH5 repository and was nearly finished with the computation at the time of the meeting. The MUSTANG interface has also been improved by the addition of several new color palettes.

Reports were presented by representatives of the data collection centers at UCSD, Albuquerque Seismological Laboratory and the PASSCAL Instrumentation Center.  A report from the Quality Assurance Advisory Committee (QAAC) was also provided. 

DS reported to the DSSC on its activities in response to the NSF review of NSF SAGE and NSF GAGE data services.  These activities include a joint project launched between IRIS and UNAVCO to build a Common Cloud Platform (CCP) for a combined SAGE/GAGE data facilities system; an updated security plan (also joint with UNAVCO); and an identity management system to respond to a request to provide statistics on what data are accessed, by whom, and for what purpose. Several other issues are also being addressed, some in the context of the joint CCP project. These include technical debt, change control processes, risk management, data projections, help desk time recording, and others.  

The Director reported that IRIS’s Magnetotelluric (MT) effort in the Instrument Services directorate is working on providing data in the PH5 and StationXML formats for archival. The work is ongoing. There was also discussion about scans of historical seismological data. The committee recommended that any such scans offered for archival continue to be evaluated on a case-by-case basis.

Recent inquiries about archival of Distributed Acoustic Sensing (DAS) data prompted a lively discussion on the potential impact of these data on DS, particularly metadata issues and data volumes both now and in the future. The committee intends to be active in working on the issues and suggested that a moderately sized data set be made available (external to the DMC) for exploring issues related to DAS data. It was noted that a Research Coordination Network (RCN) has been funded with IRIS participation that will address some or all of these issues, but that DMC-specific impacts must be addressed.

The Director presented the budget for the third year of the SAGE-II award and noted that some experienced staff time is required for the development of the SAGE/GAGE data services common cloud platform. The budget was passed following discussion.

The President of IRIS, Dr. Bob Detrick, was present and provided updates on NSF's Dear Colleague Letter, which outlined its plans for a single SAGE/GAGE facility starting in 2023; the timeline for the solicitation for the next award; and the negotiations that are under way for possibly combining IRIS and UNAVCO.

The Director presented a review of DS policies and recommended updates to the Data Acceptance Policy for Seismological Data from Permanent Networks (which requires Board approval) and to the Release of Restricted Data policy (which requires DSSC approval). The recommended changes simplify the policies, remove links to non-IRIS web sites, and reorganize the text. After discussion and minor adjustments, the updates were approved.

Annually, the DSSC reviews the DS strategic priorities. The committee felt that the existing ranking of data priorities was not appropriate and suggested that, while a distinction could be made between data from IRIS-operated sources and non-IRIS-operated sources, the priority of continuous data and data from temporary deployments should be the same.

A discussion followed on the DS priorities in a SAGE/GAGE combined facility, and the DSSC noted that a joint data services facility would have unique strengths: governance by the research community; the deep and vast expertise of facility staff and their substantial experience with the NSF SAGE and NSF GAGE research communities; and the new facility’s ability to deliver a suite of sophisticated data quality measures based on quality metrics developed by the community. The committee agreed that the DS strategic priorities document was a good starting point for this discussion and, along with those priorities, agreed that new trends in data and processing will strain existing resources and underscore the need for a more flexible and accommodating data archive. A nimble data archive that is optimally prepared for the next new research trends and potentially rapid changes in both seismology and geodesy is a priority for an integrated data services facility.

 

After suggestions for a joint IRIS/UNAVCO DS meeting at the 2020 GAGE/SAGE workshop and potential venues for the IRIS DSSC meeting in the second week of October, the meeting was adjourned.