Scientific production on data repositories and open science published in the Web of Science database:
Methodi Ordinatio and content analysis
Keywords:
Content analysis, Data repository, Open access, Open science
Abstract
The opening of scientific data proposed by the Open Science movement presupposes careful planning for data collection, organization, and treatment, aiming at their sharing, accessibility, and reuse. Data repositories have been conceived as structures necessary to enable open access to data. This study aimed to analyze the influence of data repositories on the disclosure and sharing of scientific data proposed by the Open Science movement. The Methodi Ordinatio, developed to organize a portfolio of scientific publications, was adopted to analyze the subject of ‘Data Repositories’ and ‘Open Science’. The studies were ranked using the InOrdinatio index, and the 15 best ranked studies were included and analyzed through Bardin’s content analysis. Most studies describe the structure involved in data repositories within the biological, chemical, and health areas. Other studies addressed data reuse, data organization and analysis processes and tools, as well as data selection and classification algorithms. The units of analysis selected for the content analysis were categorized as open access, information technologies, data processing, and information retrieval. Systems (processes and structures), metadata standards, ontologies, semantic web, data types, and their management were addressed by these studies. It is concluded that open data repositories are growing rapidly. Production with the greatest impact has occurred in the biological and biomedical/health areas, highlighting the structure involved in repositories within these fields. Data repositories provide systems for depositing, managing, searching, accessing, and reusing data based on processes and technologies — often developed as open-source software — in alignment with the proposed Open Science model.
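The ranking step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the InOrdinatio formulation given by Pagani, Kovaleski and Resende (2015), which combines impact factor, publication year, and citation count: InOrdinatio = (IF / 1000) + alpha * [10 - (ResearchYear - PubYear)] + Ci. The `Paper` class, the function names, the default `alpha`, and the sample values used with it are hypothetical and are not taken from this study.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    # Hypothetical record; field names are illustrative only.
    title: str
    impact_factor: float
    year: int
    citations: int


def in_ordinatio(paper: Paper, research_year: int, alpha: int = 10) -> float:
    """Score one paper, assuming the Pagani et al. (2015) formulation.

    alpha (1 to 10) is chosen by the researcher and weights how much
    recency matters relative to impact factor and citations.
    """
    age = research_year - paper.year
    return paper.impact_factor / 1000 + alpha * (10 - age) + paper.citations


def rank(papers: list[Paper], research_year: int,
         alpha: int = 10, top_n: int = 15) -> list[Paper]:
    """Return the top_n papers, highest score first."""
    ordered = sorted(
        papers,
        key=lambda p: in_ordinatio(p, research_year, alpha),
        reverse=True,
    )
    return ordered[:top_n]
```

Calling `rank(papers, research_year=2023)` on a candidate portfolio would return the 15 highest-scoring records, mirroring the cut-off described above. The value of `alpha` actually used in the study is not stated in the abstract.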
References
Acharjya, D. P.; Kauser, A. P. A survey on big data analytics: challenges, open research issues and tools. International Journal of Advanced Computer Science and Applications, v. 7, n. 2, p. 511-518, 2016. Available from: https://thesai.org/Downloads/Volume7No2/Paper_67-A_Survey_on_Big_Data_Analytics_Challenges.pdf. Cited: Oct. 1st 2023.
Bardin, L. Análise de conteúdo. Lisboa: Edições 70, 2020.
Bertin, P.R.B. et al. A parceria para Governo Aberto como plataforma para o avanço da Ciência Aberta no Brasil. Transinformação, v. 31, e190020, 2019. Doi: http://dx.doi.org/10.1590/2318-0889201931e190020.
Bhattacharya, S. et al. ImmPort, toward repurposing of open access immunological assay data for translational and clinical research. Scientific Data, v. 5, e180015, 2018. Doi: https://doi.org/10.1038/sdata.2018.15.
Bishop, L.; Kuula-Luumi, A. Revisiting qualitative data reuse: a decade on. SAGE Open, p. 1-15, 2017. Doi: https://doi.org/10.1177/2158244016685136.
Budapest Open Access Initiative. Budapest Open Access Initiative. Budapest: BOAI, 2002. Available from: http://www.opensocietyfoundations.org/openaccess/read. Cited: Oct. 1st 2023.
Carlson, S. Lost in a sea of science data. Chronicle of Higher Education, v. 52, n. 42, p. A35, 2006.
Chan, L. et al. Open Science Beyond Open Access: for and with communities. A step towards the decolonization of knowledge. Ottawa: The Canadian Commission for UNESCO’s IdeaLab, 2020.
European Commission. Open innovation, open science, open to the world: a vision for Europe. Publications Office, 2015. Doi: https://doi.org/10.2777/061652.
Fecher, B.; Friesike, S. Open Science: One term, five schools of thought. In: Bartling, S.; Friesike, S. (ed.). Opening Science. New York: Springer, 2014. Doi: https://doi.org/10.1007/978-3-319-00026-8_2.
Gad, A.G. et al. An improved binary sparrow search algorithm for feature selection in data classification. Neural Computing and Applications, v. 34, p. 15705-15752, 2022. Doi: https://doi.org/10.1007/s00521-022-07203-7.
Greenberg, J. et al. A metadata best practice for a scientific data repository. Journal of Library Metadata, v. 9, n. 3-4, p. 194-212, 2009. Doi: https://doi.org/10.1080/19386380903405090.
Guo, X. et al. CNSA: a data repository for archiving omics data. Database, v. 2020, p. 1-6, 2020. Doi: https://doi.org/10.1093/database/baaa055.
Kearnes, S.M. et al. The open reaction database. Journal of the American Chemical Society, v. 143, p. 18820-18826, 2021. Doi: https://doi.org/10.1021/jacs.1c09820.
Kindling, M.; Strecker, D. List of Data Journals. Zenodo, Version 1.0., 2022. Doi: https://doi.org/10.5281/zenodo.7082125.
Lancaster, F.W. Information Retrieval Systems: Characteristics, Testing and Evaluation. 2. ed. Los Angeles: John Wiley & Sons, 1979. (Information Sciences Series).
Michener, W.K. Ecological data sharing. Ecological Informatics, v. 29, n. 1, p. 33-44, 2015. Doi: https://doi.org/10.1016/j.ecoinf.2015.06.010.
Moriya, Y. et al. The jPOST environment: an integrated proteomics data repository and database. Nucleic Acids Research, v. 47, n. D1, p. D1218-D1224, 2019. Doi: https://doi.org/10.1093/nar/gky899.
Nargesian, F. et al. Table union search on open data. Proceedings of the VLDB Endowment, v. 11, n. 7, p. 813-825, 2018. Doi: https://doi.org/10.14778/3192965.3192973.
Nosek, B.A. et al. Promoting an open research culture. Science, v. 348, n. 6242, p. 1422, 2015. Doi: https://doi.org/10.1126/science.aab2374.
Okuda, S. et al. jPOSTrepo: an international standard data repository for proteomes. Nucleic Acids Research, v. 45, n. D1, p. D1107-D1111, 2017. Doi: https://doi.org/10.1093/nar/gkw1080.
Pagani, R. N.; Kovaleski, J. L.; Resende, L.M. Methodi Ordinatio: a proposed methodology to select and rank relevant scientific papers encompassing the impact factor, number of citation, and year of publication. Scientometrics, v. 105, p. 2109-2135, 2015. Doi: https://doi.org/10.1007/s11192-015-1744-x.
Pampel, H. et al. Re3data: Indexing the global research data repository landscape since 2012. Scientific Data, v. 10, n. 1, p. 571, 2023. Doi: https://doi.org/10.1038/s41597-023-02462-y.
Piwowar, H.A.; Vision, T.J. Data reuse and the open data citation advantage. PeerJ, v. 1, e175, 2013. Doi: https://doi.org/10.7717/peerj.175.
Pontika, N. et al. Fostering Open Science to research using a taxonomy and an eLearning Portal. In: iKnow: 15th International Conference on Knowledge Technologies and Data Driven Business, 21-22., 2015, Graz, Austria. Proceedings […]. New York: Association for Computing Machinery, 2015. p. 1-8. Doi: https://doi.org/10.1145/2809563.2809571.
Sansone, S.A. et al. FAIRsharing as a community approach to standards, repositories and policies. Nature Biotechnology, v. 37, n. 4, p. 350-369, 2019. Doi: https://doi.org/10.1038/s41587-019-0080-8.
Sayão, L.F.; Sales, L.F. Plataformas de gestão de dados de pesquisa: expandindo o conceito de repositórios de dados. Palabra Clave, v. 12, n. 1, e171, 2023. Doi: https://doi.org/10.24215/18539912e171.
Sena, P.M.B.; Segundo, W.L.R.C.; Melo, B.A. Ciência Aberta como parceria para Governo Aberto: compromisso por um novo modelo de avaliação. Informação & Informação, v. 27, n. 3, p. 14-33, 2022. Doi: https://doi.org/10.5433/1981-8920.2022v27n3p14.
Shi, G. et al. DRAMP 3.0: an enhanced comprehensive data repository of antimicrobial peptides. Nucleic Acids Research, v. 50, n. D1, p. D488-D496, 2022. Doi: https://doi.org/10.1093/nar/gkab651.
Silva, F.C.C.; Silveira, L. O ecossistema da Ciência Aberta. Transinformação, v. 31, e190001, 2019. Doi: http://dx.doi.org/10.1590/2318-0889201931e190001.
Silveira, L. et al. Taxonomia da Ciência Aberta: revisada e ampliada. Encontros Bibli, v. 28, e91712, 2023. Doi: https://doi.org/10.5007/1518-2924.2023.e91712.
Smedley, D. et al. The BioMart community portal: an innovative alternative to large, centralized data repositories. Nucleic Acids Research, v. 43, n. W1, p. W589-W598, 2015. Doi: https://doi.org/10.1093/nar/gkv350.
Strasser, C.; Abrams, S.; Cruse, P. DMPTool 2: expanding functionality for better data management planning. International Journal of Digital Curation, v. 9, n. 1, 2014. Doi: https://doi.org/10.2218/ijdc.v9i1.319.
Strasser, C. et al. Promoting data stewardship through best practices. In: Jones, M.B.; Gries, C. (ed.). Proceedings of the environmental information management conference. Santa Barbara: University of California, 2011. p. 126-131.
Strasser, C.; Cruse, P. The DMPTool and DataUp: helping researchers manage, archive, and share their data. Research Data Management Implementations Workshop, v. 13-14, 2013.
Sud, M. et al. Metabolomics Workbench: an international repository for metabolomics data and metadata, metabolite standards, protocols, tutorials and training, and analysis tools. Nucleic Acids Research, v. 44, n. D1, p. D463-D470, 2016. Doi: https://doi.org/10.1093/nar/gkv1042.
Taylor, J.R. et al. The Cambridge Centre for Ageing and Neuroscience (Cam-CAN) data repository: structural and functional MRI, MEG, and cognitive data from a cross-sectional adult lifespan sample. Neuroimage, v. 144, p. 262-269, 2017. Doi: https://doi.org/10.1016/j.neuroimage.2015.09.018.
Tenopir, C. et al. Data sharing by scientists: practices and perceptions. Plos One, v. 6, e21101, 2011. Doi: http://dx.doi.org/10.1371/journal.pone.0021101.
Ularu, E.G. et al. Perspectives on big data and big data analytics. Database Systems Journal, v. 3, n. 4, p. 3-14, 2012.
United Nations Educational, Scientific and Cultural Organization. An introduction to the UNESCO recommendation on Open Science. Paris: UNESCO, 2022. Available from: https://www.unesco.org/en/openscience. Cited: Oct. 1st 2023.
United Nations Educational, Scientific and Cultural Organization. Towards a global consensus on open science: report on UNESCO's global online consultation on open science. Paris: UNESCO, 2020. Available from: https://www.unesco.org/en/open-science. Cited: Oct. 1st 2023.
United Nations Educational, Scientific and Cultural Organization. Recommendation on Open Science. Paris: UNESCO, 2021. Available from: https://www.unesco.org/en/open-science. Cited: Oct. 1st 2023.
Vicente-Saez, R.; Martinez-Fuentes, C. Open Science now: a systematic literature review for an integrated definition. Journal of Business Research, v. 88, p. 428, 2018. Doi: https://doi.org/10.1016/j.jbusres.2017.12.043.
Walters, W.H. Data journals: incentivizing data access and documentation within the scholarly communication system. Insights, v. 33, p. 18, 2020. Doi: https://doi.org/10.1629/uksg.510.
Wilkinson, M.D. et al. The FAIR guiding principles for scientific data management and stewardship. Scientific Data, v. 3, e160018, 2016. Doi: https://doi.org/10.1038/sdata.2016.18.
Zheng, S. et al. DrugComb update: a more comprehensive drug sensitivity data repository and analysis portal. Nucleic Acids Research, v. 49, n. W1, p. W174-W184, 2021. Doi: https://doi.org/10.1093/nar/gkab438.
License
Copyright (c) 2025 Transinformação

This work is licensed under a Creative Commons Attribution 4.0 International License.



