About

The Science of Science (SciSci) is a growing field at the boundary of sociology, network science, and computational social science [1]. It encompasses diverse interdisciplinary research programs that study the processes underlying science [2]. The field has benefited greatly from access to massive digital databases containing the products of scientific discourse, including publications, journals, patents, books, conference proceedings, and grants. The subsequent proliferation of mathematical models and computational techniques for quantifying the dynamics of innovation and success in science has made it difficult to disentangle universal scientific processes from those that depend on specific databases, data-processing decisions, or field practices.

Here we present pySciSci, a Python package for the analysis of large-scale bibliometric data. The package standardizes access to many of the most common datasets in SciSci and provides efficient implementations of common and advanced analytical techniques. It is intended for SciSci researchers and for those who wish to integrate large-scale bibliometric data into other existing projects.

By creating a standardized and adaptable programmatic base for the study of bibliometric data, we intend to help democratize SciSci, support diverse research efforts based on bibliometric datasets, and address calls for open access and reproducibility in the SciSci literature and community. We also encourage the SciSci community to contribute their own implementations, data, and use cases.

Funding

pySciSci acknowledges support from the following grants:

  • Air Force Office of Scientific Research Award FA9550-19-1-0354

  • Templeton Foundation Contract 61066