Benefits from well-managed data

Data is a key resource for our society, and managing it is a prerequisite for the circular economy of data. In order for data to be reused, its life cycle must be managed, and the services must enable and support responsible action throughout this life cycle.

Data needs documentation (metadata), as well as platforms and services where both data and metadata can be processed in a controlled manner. Strong expertise and extensive cooperation are required to realize the FAIR principles, openness, cooperation and reuse of data. A rapidly developing, multidisciplinary field creates challenges that can only be solved together, in a user-centric way, by promoting good practices and by creating services that are both secure and accessible.

A record year in long-term digital preservation

Long-term digital preservation keeps digital data intact and available for the needs of researchers and other users for decades or even centuries. To achieve this, CSC has developed and maintained digital preservation capabilities that are significant by national standards and that ensure the management of data integrity, authenticity and file formats, quality assurance and continuity of the activities.

As part of providing digital preservation services, CSC also supports the organizations that use the services by sharing its expertise in these capabilities; together with the organizations, CSC strives to promote understanding and expertise related to the management and storage of digital data. The digital preservation services are provided under an agreement between CSC and the Ministry of Education and Culture.

Development of the digital preservation services

In 2021, the user organizations of digital preservation services transferred a record volume of material for storage: over 580 terabytes. On average, this means that the organizations transferred more than 1.5 terabytes of material for storage every day of the year. By now, more than 1.6 petabytes of significant cultural heritage and research data have been approved for storage. In 2021, several new preservation agreements were concluded, especially with higher education institutions.
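The daily average quoted above can be checked with simple arithmetic. The sketch below assumes a 365-day year and treats 580 terabytes as a lower bound, since the report only says "over 580"; the function and variable names are illustrative and are not part of any CSC system.

```python
# Figures taken from the report; 580 TB is a lower bound ("over 580 terabytes").
TOTAL_TRANSFERRED_TB = 580
DAYS_IN_2021 = 365  # 2021 was not a leap year


def daily_average_tb(total_tb: float, days: int) -> float:
    """Return the mean volume transferred per day, in terabytes."""
    return total_tb / days


average = daily_average_tb(TOTAL_TRANSFERRED_TB, DAYS_IN_2021)
print(f"Average transfer rate: at least {average:.2f} TB per day")
```

Because the yearly total is a lower bound, the computed value is also a lower bound, which is consistent with the report's "more than 1.5 terabytes" per day.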

A specification of requirements for logical storage was produced together with users of the long-term preservation service. These requirements specify the division of labour between digital preservation services and the organizations that store material, especially in preserving the usability of materials. In addition, successful implementation of the SAPA platform built for the needs of the National Archives of Finland continued. This success was also reflected in significant growth in the volume of materials in the Digital Preservation Service for Cultural Heritage of the National Archives of Finland.

ELIXIR Finland

As stated in Treaty no 7/2015, CSC operates as ELIXIR Finland’s node. ELIXIR is a geographically distributed European research infrastructure for organizing, storing and sharing life-science data produced by public research following the FAIR principles. ELIXIR's data resources can be accessed in CSC's DL environment. ELIXIR Finland is included in the Academy of Finland's roadmap for national research infrastructures (FIRI).

In 2021, good data management principles underpinned by the competence of the entire European ELIXIR network were made available for all Finnish researchers in the online service ELIXIR Research Data Management Kit (RDMkit). The service provides practical instructions for researchers that cover the entire life cycle of data. The service was welcomed in Finland, for example by the Academy of Finland.

CSC’s ELIXIR activities support efforts to develop management services for health and life science data in particular. Finland focuses on computing services as well as requirements and solutions for sensitive data, and this work is closely linked to the activities of national centers of expertise. Genome data, bioimaging data sets, register data and other human research data obtained by consent are some of the relevant contents. The work supports the development of AI algorithms, computing services, research use of health data and data management technology. The themes of the development work produce solutions for building secure services for using sensitive data at CSC. The success of this work is actively measured together with the Finnish Biobank Cooperative FINBB, the Finnish Institute for Health and Welfare, Euro-BioImaging Finland and the FIMM Technology Centre at the University of Helsinki.

New management services for sensitive data to support national research and education

CSC continued to develop diverse expertise in sensitive data management in 2021. In June, we released beta versions of the new SD service package. The SD Connect service enables research organizations to transfer data sets to CSC's system and store them there easily and securely while ensuring the scalability of the data. Research project data sets can be analyzed by all members participating in the project in a secure cloud computing solution provided by CSC. Access to this solution from the user’s device is offered through a secure remote desktop service, SD Desktop. In the next phase, we will expand the SD service family to include services for secondary use of data.

User experience has been taken into account from the start in SD service development, and the feedback on service deployment has been mostly positive. During the first year, the services have been used as part of life science, medical, linguistics, economy and various software development projects, among other things. In addition, we have piloted their use in the Ministry of Education and Culture's experiment on free early childhood education and care. The customer base for the research use of sensitive data expanded during the first year. The leaders of Finnish research groups can now also easily invite their international partners to join their projects. User identification and offering access to the data are based on CSC's long-term cooperation within the European research infrastructure and underpinned by international standards. The controller’s role is always assumed by the data owners, whereas CSC operates as a processor of personal data. In the SD services, data are always stored securely as part of CSC's national data management and computing research environment.

CSC participated in supporting data management in national COVID-19 research projects and also verified the virus genome sequencing results of COVID-19 positive patients commissioned by the Finnish Institute for Health and Welfare. We additionally coordinated work on a common European information model for collecting clinical histories of patients treated for COVID-19 for research use. We coordinated the incorporation of virus genomes produced in national research projects into the European Nucleotide Archive service, which enables the secondary use of research data. We additionally prepared for receiving sensitive research data collected from patients as part of CSC's forthcoming Federated European Genome-phenome Archive service.

We also continued our cooperation on developing the service package for sensitive data with the Ministry of Social Affairs and Health. CSC participates in the European 1+ Million Genomes (1+MG) project coordinated by the Ministry of Social Affairs and Health by building, in cooperation with the Finnish Institute for Health and Welfare, services for the reception, processing and findability of synthetic genomes tailored for pilot use. In addition, CSC coordinates the construction of a federated, secure, cross-border, technical infrastructure at the European level for the 1+MG project. CSC is also the supplier of information systems for the Finnish Social and Health Data Permit Authority Findata, and the company is preparing to provide SD services for the data for which Findata grants access permits for the needs of research and education. This hectic development work will continue in 2022.

The Data Support Network and the Research Data Management Competence Centre made headway

The work of the Research Data Management Competence Centre and the Data Support Network was developed and cooperation with data support personnel in organizations was stepped up. Competence and training needs were mapped using a Webropol survey, an interactive webinar and informal discussions held once a month. Increasing numbers of courses and workshops will be organized in collaboration with different research organizations on themes that came up in the mapping of needs. CSC put together an English-language data management course that can be completed independently. It covers the basics of data management and explains what resources or tools are available for the different stages. This course, which is open to all data support persons and scientists in research organizations, helps participants understand the basics and practices of sensible data management.

Fairdata services promote open research and science

CSC provides Fairdata services that enable higher education institutions and research institutes to promote the openness, accessibility and preservation of research data. The services enable the processing of research data from (raw) data into high-quality, accessible data sets with metadata, the reuse and long-term preservation of which can be guaranteed.

The efforts to develop the Fairdata services have made it easier than ever for organizations to export metadata for data sets programmatically, directly from their own systems. Currently, the sources of metadata contained in the Fairdata services include the Language Bank of Finland, the Finnish Social Science Data Archive, the Finnish Environment Institute and the University of Jyväskylä. New sources of metadata will be added in the near future.

CSC’s development efforts together with the National Research Information Hub have increased organizations' interest in making their data sets available through Fairdata. As a result of this work, since 2021 it has also been possible to browse the metadata of research data sets contained in Fairdata in the Research.fi portal. Making data sets available through the portal expands their visibility and makes them easier to find. In the future, the Finnish Research Information Hub aims to compile the metadata of Finnish research data sets as extensively as possible and link it to other information describing research conducted in Finland, including publications, research infrastructures and funding decisions.

The efforts to develop these services have been accelerated by the launch of the Fairdata network. Through this network, organizations using the services can participate in developing the activities and receive peer support for deploying the services and developing their own processes.

Data management services for individual organizations improved

The EUDAT services provided by CSC complement the range of data management services with the possibility of tailoring the services for individual customers. They meet organizations’ or research infrastructures’ specific needs that the shared options do not cover. In 2021, two new tailored services were taken into production use for Finnish organizations (the Finnish Meteorological Institute and the University of Helsinki). Closer interoperability with the national service selection is also being developed. The aim is, in the near future, to link national research findings to the Fairdata metadata catalogue, making it possible to find research data published in the EUDAT service this way as well.

CSC's EUDAT services are part of the European EUDAT infrastructure and thus also strongly involved in building the European Open Science Cloud (EOSC). The international environment has meant that significant amounts of EU funding have been available for developing and providing these services. The DICE project launched in 2021 provides funding for maintaining open services and deploying tailored services. CSC continues to play an important role in the governance of the European EUDAT infrastructure, as indicated by the recent appointments of CSC staff to EUDAT roles: Per Öster was elected EUDAT CDI Council Chair in December 2020, and Antti Pursula was appointed Head of the EUDAT CDI Secretariat in March 2021.

Sustained efforts have been made to improve the quality of the EUDAT services, both at the national and the European level. CSC has focused on automation of software production processes. At the European level, common service level requirements have been defined for both internal and external service components. The purpose of these upgrades is to further improve the reliability and security of the EUDAT services.

Data warehouses for education support widespread data use and digitalisation of society

The data warehouses developed by CSC gather data on research, education and other public administration across a broad front and enable widespread exploitation of the data in different services and uses. These data warehouses originally built for the authorities' needs are increasingly recognized as essential sources of information, and other actors can build their digital services and operating processes by connecting to them. For example, the data are used by municipalities and joint municipal authorities, the Finnish National Agency for Education, the Finnish Student Health Service, the Social Insurance Institution, Statistics Finland, Helsinki Region Transport and the Employment Fund. The data in the National data repository for higher education institutions have also been used for such purposes as assessing the impacts of the COVID-19 pandemic (Finnish Education Evaluation Centre, Helsinki GSE Situation Room).

Oiva, a system administered by three departments of the Ministry of Education and Culture, is the educational administration’s steering and regulation service that supports the executive steering of vocational upper secondary education, basic education and liberal adult education as well as basic education in the arts. It contains education providers’ licenses and authorizations as well as services for management by information. Oiva was developed for new production phases, and the service now contains 670 education providers’ licenses (an increase of 380% since 2020).

The data collection solutions implemented by CSC enable the Ministry of Education and Culture to allocate funding to higher education institutions and other education providers. Information on their education, research, personnel and finances is collected directly from all higher education institutions. CSC also successfully implemented a challenging software package for vocational education and training that comprised performance decision calculations and the freezing of data. Education Management Information Service Arvo is a survey data collection system tailored for the education administration. Among other things, the knowledge base collected by means of national surveys supports the allocation of funding to higher education institutions and VET providers. Arvo is used by 174 education providers, and through them, feedback or measurement results were collected from a total of 363,220 respondents/students (an increase of 40% since 2020).

Varda, the data warehouse for early childhood education, which is widely used in both municipal and national development efforts and decision-making, expanded to include data concerning 79,000 ECEC employees (an increase of 1,029% since 2020). Varda’s interface for citizens was visited 35,000 times in 2021.

The data content of the Finnish National Agency for Education’s national statistical service, Vipunen, expanded to include data on the acquisition of competence in vocational education and training (eHOKS, or personal competence development plans) and on pre-primary and basic education. The eHOKS data enable the education administration and education providers to monitor the fulfillment of legal requirements concerning students’ accumulation of competence and to develop student guidance, for example in apprenticeship and training agreements. The continuously updated data on pre-primary and basic education, on the other hand, support the network of providers at these levels of education and the education administration.

The development of a data collection and analysis service for the Finnish Education Evaluation Centre (FINEEC) was launched. Valssi, the Quality assessment system for early childhood education, will make it possible to collect and analyze data to support the development of early childhood education and care. Once Valssi has been completed, it will produce nationally consistent, reliable and cumulative monitoring data on the quality of early childhood education and care and make easy-to-use tools for quality management and self-assessment available locally for ECEC providers and private service providers.

Better findability of research data

The Finnish Research Information Hub collects data on research conducted in Finland into a single repository and provides open access to browsing them on the website Research.fi. The separate Act on the Research Information Hub was passed in late 2021. CSC had contributed its expertise in research data management to the drafting of the Act. The Act enables the further use of data, for example in research organizations' and funding providers' processes, reducing the need to enter the same data several times. The Finnish Research Information Hub also expanded to cover research data sets available through the national Fairdata.fi services.

The data content concerning research projects also expanded as, among other things, information on the EU Structural Funds was included in the repository. The Aurora database, which is a key Finnish information source for open calls for proposals related to research funding, was integrated into the Research Information Hub, creating significant cost savings through a reduced need to maintain different systems and expanding the user base of the web service Research.fi to grant applicants. Research.fi has attracted 6,000 to 7,000 visits every month.

Working together to provide services for society

CSC is involved as a technical expert, administrator and developer in many of Finnish society's unique information system projects, including the National Audiovisual Institute's Radio and Television Archive and the Digital Warehouse for films, the National Library’s Finna search service for cultural and scientific material, and certain digital services of the National Archives.

CSC maintains and develops the State Treasury’s Financial information service for municipalities, in which the municipalities’ financial information is collected and classified. In Exploreadministration.fi, the service portal of municipalities’ financial information, users can examine a wide range of financial information concerning both municipalities and the central government. In 2022, the service will be expanded to include the new wellbeing services counties.

CSC maintains and develops the information systems of Findata, the Finnish Social and Health Data Permit Authority. The development of services will continue. Findata’s remote access environment for researchers passed an information security audit carried out in March and April 2021.

Statistics Finland's remote use system, FIONA, is a secure environment for processing unit-level data sets needed in research, including Statistics Finland’s micro data. CSC is responsible for FIONA's technical maintenance. Statistics Finland's FIONA remote access environment was used to carry out the analyses of Helsinki GSE’s Situation Room.

The Ministry of Education and Culture and the Regional State Administrative Agency for Western and Inland Finland together with CSC implemented a national contact system for outreach youth work. The system was commissioned in September 2021. The service, yhteysetsivaan.fi, is a free national service through which a young person's identifying and contact information can be disclosed to the outreach youth work services of their home municipality smoothly and securely, in compliance with data protection regulations and the Accessibility Directive.
