Biblio-Networks

Intro to networks

pyscisci.network.coauthorship_network(paa, focus_author_ids=None, focus_constraint='authors', temporal=False, show_progress=False)

Create the co-authorship network.

Parameters:
  • paa (DataFrame) – A DataFrame with the links between authors and publications.

  • focus_author_ids (numpy array or list, default None) – A list of the AuthorIds to seed the coauthorship-network.

  • focus_constraint (str, default 'authors') –

    If focus_author_ids is not None:
    • 'authors' : the focus_author_ids defines the node set, giving only the co-authorships between authors in the set.

    • 'publications' : the publication history of focus_author_ids defines the edge set, giving the co-authorships where at least one author from focus_author_ids was involved.

    • 'ego' : the focus_author_ids defines a seed set, such that all authors must have co-authored at least one publication with an author from focus_author_ids, but co-authorships are also found between the second-order author sets.

  • temporal (bool, default False) – If True, compute a separate adjacency matrix for each year, using only the publications from that year.

  • show_progress (bool, default False) – If True, show a progress bar tracking the calculation.

Returns:

  • coo_matrix or dict of coo_matrix

    If temporal == False:

    The adjacency matrix for the co-authorship network

    If temporal == True:

    A dictionary with a key for each year, whose value is the adjacency matrix for the co-authorship network induced by publications in that year.

  • author2int, dict – A mapping of AuthorIds to the row/column of the adjacency matrix.
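The three focus_constraint options are easier to compare on a small example. The following is a minimal pure-Python sketch of one reading of the descriptions above; it does not call pyscisci, and the toy rows, the function name coauthorship_counts, and the use of pair counts as edge weights are all assumptions made for illustration. The library itself returns a coo_matrix rather than a dictionary.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: (PublicationId, AuthorId) rows standing in for paa.
paa_rows = [
    (1, "a"), (1, "b"), (1, "c"),
    (2, "a"), (2, "b"),
    (3, "c"), (3, "d"),
    (4, "b"), (4, "c"),
]


def coauthorship_counts(rows, focus_author_ids=None, focus_constraint="authors"):
    """Count shared publications per unordered author pair (illustrative only)."""
    pub2authors = {}
    for pub_id, author_id in rows:
        pub2authors.setdefault(pub_id, set()).add(author_id)

    focus = None if focus_author_ids is None else set(focus_author_ids)
    allowed = None  # optional restriction on the node set
    if focus is not None and focus_constraint == "authors":
        allowed = focus
    elif focus is not None and focus_constraint == "ego":
        # Seed authors plus everyone who shares a publication with a seed.
        allowed = set(focus)
        for authors in pub2authors.values():
            if authors & focus:
                allowed |= authors

    weights = Counter()
    for authors in pub2authors.values():
        if focus is not None and focus_constraint == "publications":
            if not authors & focus:
                continue  # keep only publications with a focus author
        if allowed is not None:
            authors = authors & allowed
        for pair in combinations(sorted(authors), 2):
            weights[pair] += 1
    return dict(weights)


full = coauthorship_counts(paa_rows)
only_ab = coauthorship_counts(paa_rows, ["a", "b"], "authors")
pubs_of_a = coauthorship_counts(paa_rows, ["a"], "publications")
ego_of_a = coauthorship_counts(paa_rows, ["a"], "ego")

# Analogue of author2int: AuthorId -> row/column index of the matrix.
author2int = {a: i for i, a in enumerate(sorted({a for _, a in paa_rows}))}
```

In this toy data, publication 4 is written by b and c without a. Under this reading it contributes to the 'ego' network of a, because both authors are co-authors of a, but not to the 'publications' network, because a is not on the paper.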


pyscisci.network.cocitation_network(pub2ref, focus_pub_ids=None, focus_constraint='citing', cited_col_name='CitedPublicationId', citing_col_name='CitingPublicationId', temporal=False, show_progress=False)

Create the co-citation network.

Parameters:
  • pub2ref (DataFrame) – A DataFrame with the citation links between citing and cited publications.

  • focus_pub_ids (numpy array or list, default None) – A list of the PublicationIds to seed the cocitation-network.

  • focus_constraint (str, default 'citing') –

    If focus_pub_ids is not None:
    • 'citing' : the focus_pub_ids defines the citation set, giving only the co-citations between the references of the publications from this set.

    • 'cited' : the focus_pub_ids defines the cocitation node set.

    • 'egocited' : the focus_pub_ids defines a seed set, such that all other publications must have been co-cited with at least one publication from this set.

  • cited_col_name (str, default 'CitedPublicationId') – The name of the cited value column in the DataFrame pub2ref

  • citing_col_name (str, default 'CitingPublicationId') – The name of the citing value column in the DataFrame pub2ref

  • temporal (bool, default False) – If True, compute a separate adjacency matrix for each year, using only the citing publications from that year.

  • show_progress (bool, default False) – If True, show a progress bar tracking the calculation.

Returns:

  • coo_matrix or dict of coo_matrix

    If temporal == False:

    The adjacency matrix for the co-citation network

    If temporal == True:

    A dictionary with a key for each year, whose value is the adjacency matrix for the co-citation network induced by citing publications in that year.

  • pub2int, dict – A mapping of PublicationIds to the row/column of the adjacency matrix.
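Two publications are co-cited when the same citing publication references both of them. The sketch below restates that idea in pure Python, including one reading of the 'citing' and 'cited' constraints; 'egocited' is left out. It does not call pyscisci, and the toy rows and the function name cocitation_counts are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy data: (CitingPublicationId, CitedPublicationId) rows.
pub2ref_rows = [(10, 1), (10, 2), (10, 3), (11, 1), (11, 2), (12, 3)]


def cocitation_counts(rows, focus_pub_ids=None, focus_constraint="citing"):
    """Count, per pair of cited publications, how many papers cite both."""
    focus = None if focus_pub_ids is None else set(focus_pub_ids)

    # Reference list of each citing publication, after applying the focus.
    refs = defaultdict(set)
    for citing, cited in rows:
        if focus is not None:
            if focus_constraint == "citing" and citing not in focus:
                continue  # only reference lists of the focus publications
            if focus_constraint == "cited" and cited not in focus:
                continue  # only focus publications may appear as nodes
        refs[citing].add(cited)

    weights = Counter()
    for cited_set in refs.values():
        for pair in combinations(sorted(cited_set), 2):
            weights[pair] += 1
    return dict(weights)


all_pairs = cocitation_counts(pub2ref_rows)
from_pub_10 = cocitation_counts(pub2ref_rows, [10], "citing")
among_1_2 = cocitation_counts(pub2ref_rows, [1, 2], "cited")
```

Publications 1 and 2 appear together in the reference lists of both 10 and 11, so their co-citation count is 2 in the unrestricted case.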

pyscisci.network.cociting_network(pub2ref, focus_pub_ids=None, focus_constraint='citing', cited_col_name='CitedPublicationId', citing_col_name='CitingPublicationId', temporal=False, show_progress=False)

Create the co-citing network. Each node is a publication, two publications are linked if they cite the same article.

Parameters:
  • pub2ref (DataFrame) – A DataFrame with the citation links between citing and cited publications.

  • focus_pub_ids (numpy array or list, default None) – A list of the PublicationIds to seed the co-citing network.

  • focus_constraint (str, default 'citing') –

    If focus_pub_ids is not None:
    • 'citing' : the focus_pub_ids defines the citation set, giving only the co-citations between the references of the publications from this set.

    • 'cited' : the focus_pub_ids defines the cocitation node set.

  • cited_col_name (str, default 'CitedPublicationId') – The name of the cited value column in the DataFrame pub2ref

  • citing_col_name (str, default 'CitingPublicationId') – The name of the citing value column in the DataFrame pub2ref

  • show_progress (bool, default False) – If True, show a progress bar tracking the calculation.

Returns:

  • coo_matrix or dict of coo_matrix – The adjacency matrix for the co-citing network

  • pub2int, dict – A mapping of PublicationIds to the row/column of the adjacency matrix.
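The co-citing network reverses the grouping used for co-citation: citing publications are paired whenever they share a cited publication. The sketch below also shows how a pub2int mapping relates PublicationIds to the row and column indices of a coordinate-format matrix. It is pure Python and does not call pyscisci; the toy rows, the use of counts as weights, and the choice to store each edge in both directions are assumptions.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy data: (CitingPublicationId, CitedPublicationId) rows.
pub2ref_rows = [(10, 1), (10, 2), (10, 3), (11, 1), (11, 2), (12, 3)]

# Set of citing publications for each cited publication.
citers = defaultdict(set)
for citing, cited in pub2ref_rows:
    citers[cited].add(citing)

# Two citing publications are linked once per reference they share.
weights = Counter()
for citing_set in citers.values():
    for pair in combinations(sorted(citing_set), 2):
        weights[pair] += 1

# Analogue of pub2int: PublicationId -> row/column index.
pub2int = {p: i for i, p in enumerate(sorted({c for c, _ in pub2ref_rows}))}

# Coordinate-format triplets (row, col, value), stored symmetrically.
rows, cols, data = [], [], []
for (u, v), w in sorted(weights.items()):
    rows += [pub2int[u], pub2int[v]]
    cols += [pub2int[v], pub2int[u]]
    data += [w, w]
```

The three lists have the shape expected by a coordinate-format sparse matrix constructor such as scipy.sparse.coo_matrix((data, (rows, cols))).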


pyscisci.network.estimate_resolution(G, com)

Newman, MEJ (2016) Community detection in networks: Modularity optimization and maximum likelihood are equivalent. Phys. Rev. E.
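This entry gives only the citation. As a rough guide, the cited paper relates the modularity resolution parameter to the within-community and between-community connection rates of a planted-partition model. The sketch below follows that estimate as it is commonly stated. Whether pyscisci computes it this way, and what types G and com take, is not confirmed by this page; the edge-list input, the community dictionary, and the name estimate_gamma are hypothetical.

```python
import math
from collections import defaultdict


def estimate_gamma(edges, community):
    """Resolution estimate from an undirected edge list and a node->community map.

    Assumes no self-loops or duplicate edges, at least two communities, and
    at least one edge both within and between communities.
    """
    m = len(edges)
    m_in = sum(1 for u, v in edges if community[u] == community[v])

    # kappa[r]: sum of the degrees of the nodes in community r.
    kappa = defaultdict(int)
    for u, v in edges:
        kappa[community[u]] += 1
        kappa[community[v]] += 1

    expected_in = sum(k * k for k in kappa.values()) / (2 * m)
    if 2 * m - expected_in == 0 or m_in == 0 or m_in == m:
        raise ValueError("estimate undefined for this partition")

    omega_in = 2 * m_in / expected_in
    omega_out = (2 * m - 2 * m_in) / (2 * m - expected_in)
    if omega_in == omega_out:
        raise ValueError("estimate undefined when omega_in equals omega_out")
    return (omega_in - omega_out) / (math.log(omega_in) - math.log(omega_out))


# Hypothetical example: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
gamma = estimate_gamma(edges, community)
```

In the cited paper the estimate is used iteratively: communities are detected at the current resolution, the resolution is re-estimated from them, and the two steps repeat until the value stabilizes.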

pyscisci.network.extract_multiscale_backbone(Xs, alpha)

A sparse matrix implementation of the multiscale backbone.

References

Serrano et al. (2009) Extracting the multiscale backbone of complex weighted networks. PNAS.

Parameters:
  • Xs (numpy.array or sp.sparse matrix) – The adjacency matrix for the network.

  • alpha (float) – The significance value.

Returns:

The directed, weighted multiscale backbone

Return type:

coo_matrix
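The disparity filter of Serrano et al. keeps an edge when its weight is unexpectedly large relative to the other edges of the same node. The sketch below applies that test from the point of view of each source node, which matches the directed result described above. It is a pure-Python illustration on a dictionary of weights, not the sparse-matrix implementation in pyscisci; the toy network and the name disparity_backbone are hypothetical.

```python
def disparity_backbone(weights, alpha):
    """Keep edge i -> j when its disparity p-value, seen from i, is below alpha.

    weights: {node: {neighbor: positive weight}}.
    """
    backbone = {}
    for i, nbrs in weights.items():
        k = len(nbrs)            # degree of node i
        s = sum(nbrs.values())   # strength of node i
        for j, w in nbrs.items():
            # Probability that a uniformly random split of s over k edges
            # gives this edge a share at least as large as w / s.
            p_value = (1.0 - w / s) ** (k - 1)
            if p_value < alpha:
                backbone[(i, j)] = w
    return backbone


# Hypothetical symmetric toy network: node 0 has one heavy and two light edges.
weights = {
    0: {1: 10, 2: 1, 3: 1},
    1: {0: 10},
    2: {0: 1},
    3: {0: 1},
}
backbone = disparity_backbone(weights, alpha=0.05)
```

For node 0 the heavy edge has p-value (1 - 10/12)^2, about 0.028, so it survives at alpha = 0.05, while the light edges have p-value (11/12)^2, about 0.84. Nodes with a single edge always get a p-value of 1 in this sketch, so the edge from 1 to 0 is dropped even though the edge from 0 to 1 is kept.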