Biblio-Networks
Intro to networks
- pyscisci.network.coauthorship_network(paa, focus_author_ids=None, focus_constraint='authors', temporal=False, show_progress=False)
Create the co-authorship network.
- Parameters:
paa (DataFrame) – A DataFrame with the links between authors and publications.
focus_author_ids (numpy array or list, default None) – A list of the AuthorIds to seed the coauthorship-network.
focus_constraint (str, default 'authors') –
- If focus_author_ids is not None:
'authors' : the focus_author_ids define the node set, giving only the co-authorships between authors in the set.
'publications' : the publication history of focus_author_ids defines the edge set, giving the co-authorships where at least one author from focus_author_ids was involved.
'ego' : the focus_author_ids define a seed set, such that all authors must have co-authored at least one publication with an author from focus_author_ids, but co-authorships are also found between the second-order author sets.
temporal (bool, default False) – If True, compute a separate adjacency matrix for each year, using only the publications from that year.
show_progress (bool, default False) – If True, show a progress bar tracking the calculation.
- Returns:
coo_matrix or dict of coo_matrix –
- If temporal == False:
The adjacency matrix for the co-authorship network
- If temporal == True:
A dictionary with a key for each year, whose value is the adjacency matrix for the co-authorship network induced by publications in that year.
author2int, dict – A mapping of AuthorIds to the row/column of the adjacency matrix.
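To make the return values more concrete, here is a minimal standard-library sketch of what a co-authorship adjacency structure and an author2int-style mapping encode. It does not call pyscisci: the toy records and the helper name `coauthor_counts` are hypothetical, and weighting each edge by the number of shared publications is an assumption, not a statement about how the library weights edges.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Toy publication-author links (hypothetical data): (PublicationId, AuthorId)
paa_rows = [
    ("p1", "a1"), ("p1", "a2"),
    ("p2", "a1"), ("p2", "a2"), ("p2", "a3"),
    ("p3", "a3"),
]

def coauthor_counts(rows):
    """Count shared publications for each unordered pair of authors."""
    authors_by_pub = defaultdict(set)
    for pub_id, author_id in rows:
        authors_by_pub[pub_id].add(author_id)

    # Map each AuthorId to a row/column index, analogous to author2int.
    author2int = {a: i for i, a in enumerate(sorted({a for _, a in rows}))}

    weights = Counter()
    for authors in authors_by_pub.values():
        # Every pair of authors on the same publication gains one unit of weight.
        for a, b in combinations(sorted(authors), 2):
            weights[(author2int[a], author2int[b])] += 1
    return weights, author2int

weights, author2int = coauthor_counts(paa_rows)
```

Each key of `weights` is a (row, column) pair and each value is an edge weight, which is the same information a sparse coordinate-format matrix stores.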
- pyscisci.network.cocitation_network(pub2ref, focus_pub_ids=None, focus_constraint='citing', cited_col_name='CitedPublicationId', citing_col_name='CitingPublicationId', temporal=False, show_progress=False)
Create the co-citation network.
- Parameters:
pub2ref (DataFrame) – A DataFrame with the links between citing and cited publications.
focus_pub_ids (numpy array or list, default None) – A list of the PublicationIds to seed the cocitation-network.
focus_constraint (str, default 'citing') –
- If focus_pub_ids is not None:
'citing' : the focus_pub_ids define the citation set, giving only the co-citations between the references of the publications from this set.
'cited' : the focus_pub_ids define the co-citation node set.
'egocited' : the focus_pub_ids define a seed set, such that all other publications must have been co-cited with at least one publication from this set.
cited_col_name (str, default 'CitedPublicationId') – The name of the cited value column in the DataFrame pub2ref
citing_col_name (str, default 'CitingPublicationId') – The name of the citing value column in the DataFrame pub2ref
temporal (bool, default False) – If True, compute a separate adjacency matrix for each year, using only the publications from that year.
show_progress (bool, default False) – If True, show a progress bar tracking the calculation.
- Returns:
coo_matrix or dict of coo_matrix –
- If temporal == False:
The adjacency matrix for the co-citation network
- If temporal == True:
A dictionary with a key for each year, whose value is the adjacency matrix for the co-citation network induced by citing publications in that year.
pub2int, dict – A mapping of PublicationIds to the row/column of the adjacency matrix.
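As an illustration of the co-citation relation itself, the sketch below links two cited publications whenever the same citing publication references both. It uses only the standard library and does not call pyscisci; the toy rows and the helper name `cocitation_counts` are hypothetical, and counting one unit per shared citing publication is an assumption.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Toy citation links (hypothetical data): (CitingPublicationId, CitedPublicationId)
pub2ref_rows = [
    ("c1", "r1"), ("c1", "r2"),
    ("c2", "r1"), ("c2", "r2"), ("c2", "r3"),
]

def cocitation_counts(rows):
    """Count how often each pair of references appears in the same reference list."""
    refs_by_citing = defaultdict(set)
    for citing, cited in rows:
        refs_by_citing[citing].add(cited)

    counts = Counter()
    for refs in refs_by_citing.values():
        # Every pair of references cited together gains one unit of weight.
        for pair in combinations(sorted(refs), 2):
            counts[pair] += 1
    return counts

cocited = cocitation_counts(pub2ref_rows)
```

Here `r1` and `r2` are cited together by both `c1` and `c2`, so their pair carries the largest weight.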
- pyscisci.network.cociting_network(pub2ref, focus_pub_ids=None, focus_constraint='citing', cited_col_name='CitedPublicationId', citing_col_name='CitingPublicationId', temporal=False, show_progress=False)
Create the co-citing network. Each node is a publication; two publications are linked if they cite the same article.
- Parameters:
pub2ref (DataFrame) – A DataFrame with the links between citing and cited publications.
focus_pub_ids (numpy array or list, default None) – A list of the PublicationIds to seed the co-citing network.
focus_constraint (str, default 'citing') –
- If focus_pub_ids is not None:
'citing' : the focus_pub_ids define the citation set, giving only the co-citations between the references of the publications from this set.
'cited' : the focus_pub_ids define the co-citation node set.
cited_col_name (str, default 'CitedPublicationId') – The name of the cited value column in the DataFrame pub2ref
citing_col_name (str, default 'CitingPublicationId') – The name of the citing value column in the DataFrame pub2ref
show_progress (bool, default False) – If True, show a progress bar tracking the calculation.
- Returns:
coo_matrix or dict of coo_matrix – The adjacency matrix for the co-citing network
pub2int, dict – A mapping of PublicationIds to the row/column of the adjacency matrix.
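The co-citing relation groups by the cited publication rather than the citing one: two citing publications are linked when their reference lists overlap. The following standard-library sketch shows this under stated assumptions. It does not call pyscisci; the toy rows and the helper name `cociting_counts` are hypothetical, and weighting by the number of shared references is an assumption.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Toy citation links (hypothetical data): (CitingPublicationId, CitedPublicationId)
pub2ref_rows = [
    ("c1", "r1"), ("c1", "r2"),
    ("c2", "r1"), ("c2", "r2"), ("c2", "r3"),
    ("c3", "r3"),
]

def cociting_counts(rows):
    """Count shared references for each unordered pair of citing publications."""
    citers_by_ref = defaultdict(set)
    for citing, cited in rows:
        citers_by_ref[cited].add(citing)

    counts = Counter()
    for citers in citers_by_ref.values():
        # Every pair of publications citing the same article gains one unit of weight.
        for pair in combinations(sorted(citers), 2):
            counts[pair] += 1
    return counts

cociting = cociting_counts(pub2ref_rows)
```

Because `c1` and `c3` share no references, no edge appears between them.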
- pyscisci.network.estimate_resolution(G, com)
Newman, MEJ (2016) Community detection in networks: Modularity optimization and maximum likelihood are equivalent. Phys. Rev. E
- pyscisci.network.extract_multiscale_backbone(Xs, alpha)
A sparse matrix implementation of the multiscale backbone.
References
Serrano et al. (2009) Extracting the multiscale backbone of complex weighted networks. PNAS.
- Parameters:
Xs (numpy.array or sp.sparse matrix) – The adjacency matrix for the network.
alpha (float) – The significance value.
- Returns:
The directed, weighted multiscale backbone
- Return type:
coo_matrix
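For orientation, the disparity filter described by Serrano et al. (2009) can be sketched in plain Python as follows. This is not the pyscisci implementation: the helper name `disparity_backbone` and the edge dictionary are hypothetical, and testing each directed edge against the outgoing weights of its source node is an assumption about how the directed backbone is formed. An edge of weight w from a node with out-degree k and out-strength s is kept when (1 - w/s)^(k - 1) falls below alpha.

```python
def disparity_backbone(weights, alpha):
    """Keep directed edges whose weight is significant for their source node.

    weights: dict mapping (source, target) -> positive edge weight.
    alpha: significance level; smaller values keep fewer edges.
    """
    strength = {}
    degree = {}
    for (src, _), w in weights.items():
        strength[src] = strength.get(src, 0.0) + w
        degree[src] = degree.get(src, 0) + 1

    backbone = {}
    for (src, dst), w in weights.items():
        k = degree[src]
        # Probability of a normalized weight at least this large under the
        # null model in which a node's strength is split uniformly at random.
        p_value = (1.0 - w / strength[src]) ** (k - 1)
        if p_value < alpha:
            backbone[(src, dst)] = w
    return backbone

# Hypothetical toy network: node "a" sends most of its weight to "b".
edges = {
    ("a", "b"): 10.0, ("a", "c"): 1.0, ("a", "d"): 1.0,
    ("b", "a"): 10.0,
    ("c", "a"): 1.0,
    ("d", "a"): 1.0,
}
backbone = disparity_backbone(edges, alpha=0.05)
```

Only the edge from `a` to `b` survives here. Nodes with a single outgoing edge always get a p-value of 1 under this test, so their edges are never kept from their own perspective.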