Metadata-Version: 2.1
Name: gefpy
Version: 1.3.2
Summary: A thin, pythonic wrapper around geftool.
Home-page: https://github.com/STOmics/gefpy
Author: STOmics
Author-email: huangzhibo@genomics.cn
License: UNKNOWN
Platform: UNKNOWN
Classifier: Natural Language :: English
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering :: Visualization
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Requires-Dist: h5py >=3.8.0
Requires-Dist: numpy >=1.20.0
Requires-Dist: matplotlib
Requires-Dist: seaborn
Requires-Dist: pandas
Requires-Dist: geojson
Requires-Dist: tifffile
Requires-Dist: opencv-python
Provides-Extra: docs
Requires-Dist: sphinx >=3.2 ; extra == 'docs'
Provides-Extra: test
Requires-Dist: pytest >=4.4 ; extra == 'test'
Requires-Dist: pytest-nunit ; extra == 'test'

### gefpy documentation (gefpy is a thin, Pythonic wrapper around geftools that runs on Python 3.7+.)
#### bgef_creater_cy module
+ Python interface to the bgef creation functionality in geftools
+ Related interfaces
  ```
  .. py:module:: bgef_creater_cy

    .. py:class:: BgefCreater(thcnt=8)

    .. py:method:: create_bgef(self, strin, bin, strmask, strout)

        Create a tissuecut bgef from a bgef/gem file and a mask.

        :param strin: raw bgef or bgem
        :param bin: mask binsize
        :param strmask: mask path
        :param strout: out path

    .. py:method:: get_stereo_data(self, strin, bin, strmask)

        Get tissuecut stereo data from a bgef/gem file and a mask.

        :param strin: raw bgef or bgem
        :param bin: mask binsize
        :param strmask: mask path

        + uniq_cell is a list of all cells; each value is (exp.x<<32 | exp.y).
        + gene_names is a list of gene names.
        + count is a list holding the MID count of each expression record.
        + cell_index is a list holding the cell index of each expression record.
        + gene_index is a list holding the gene index of each expression record.


        :return: (uniq_cell, gene_names, count, cell_index, gene_index)
  ```
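
The `uniq_cell` values above pack two coordinates into one integer as `(exp.x<<32 | exp.y)`. The following is a minimal sketch of how such ids could be unpacked with NumPy. The helper names `encode_cell_id` and `decode_cell_id` are hypothetical and not part of gefpy, and the sketch assumes both coordinates fit in 32 unsigned bits.

```python
import numpy as np

def encode_cell_id(x, y):
    """Pack x into the high 32 bits and y into the low 32 bits (hypothetical helper)."""
    x = np.asarray(x, dtype=np.uint64)
    y = np.asarray(y, dtype=np.uint64)
    return (x << np.uint64(32)) | y

def decode_cell_id(cell_id):
    """Split packed ids back into (x, y) arrays (hypothetical helper)."""
    cell_id = np.asarray(cell_id, dtype=np.uint64)
    x = cell_id >> np.uint64(32)            # high 32 bits
    y = cell_id & np.uint64(0xFFFFFFFF)     # low 32 bits
    return x, y

# Toy ids standing in for the uniq_cell list returned by get_stereo_data.
uniq_cell = encode_cell_id([100, 100, 250], [7, 8, 4000])
xs, ys = decode_cell_id(uniq_cell)
print(xs.tolist(), ys.tolist())
```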
---
#### bgef_reader_cy module
+ Provides an interface for reading gef files
+ Related interfaces
  ```
    .. py:module:: bgef_reader_cy

    .. py:class:: BgefR(filepath, bin_size, n_thread)

    .. py:method:: get_expression_num(self)

    Get the number of expression records.

    .. py:method:: get_cell_num(self)

    Get the number of cells.

    .. py:method:: get_gene_num(self)

    Get the number of genes.

    .. py:method:: get_gene_names(self)

    Get a list of gene names.

    .. py:method:: get_cell_names(self)

    Get a list of cell ids, each item is (exp.x<<32 | exp.y)

    .. py:method:: get_gene_data(self)

    Get gene data.

    + gene_index is a list that records the gene serial number corresponding to each exp.
    + gene_names is a list of gene names.

    :return: (gene_index, gene_names)

    .. py:method:: get_expression(self)

    Get all expression records from the bgef.

    + explist is a list, each item is (x, y, count, exon).

    :return: explist

    .. py:method:: get_exp_data(self)

    Get sparse matrix indexes of expression data.

    + uniq_cell is a list of all cells; each value is (exp.x<<32 | exp.y).
    + cell_index is a list holding the cell index of each expression record.
    + count is a list holding the MID count of each expression record.

    :return: (uniq_cell, cell_index, count)

    .. py:method:: get_genedata_in_region(self, min_x, max_x, min_y, max_y, key)

    Get the explist by the specified gene name in the region.

    :param min_x: region minx
    :param max_x: region maxx
    :param min_y: region miny
    :param max_y: region maxy
    :param key: gene name
    :return: explist

    .. py:method:: get_offset(self)

    Get the offset in bgef.

    :return: (minx, miny)

    .. py:method:: get_exp_attr(self)

    Get the bgef attr.

    :return: (minx, miny, maxx, maxy, maxexp, resolution)

    .. py:method:: get_filtered_data(self, region, genelist)

    Get the filtered data from bgef by region or gene.

    :param region: rect region(minx,maxx,miny,maxy)
    :param genelist: gene name list

    + uniq_cell is a list of all cells; each value is (exp.x<<32 | exp.y).
    + gene_names is a list of gene names.
    + count is a list holding the MID count of each expression record.
    + cell_index is a list holding the cell index of each expression record.
    + gene_index is a list holding the gene index of each expression record.

    :return: (uniq_cell, gene_names, count, cell_index, gene_index)
  ```
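
The tuple `(uniq_cell, gene_names, count, cell_index, gene_index)` returned by `get_filtered_data` reads like a coordinate-format sparse matrix: one `count` per expression record, addressed by a cell index and a gene index. The sketch below assembles a small dense cell-by-gene matrix from such lists with NumPy. The input values are toy data, and `to_dense_matrix` is a hypothetical helper, not a gefpy function. For real datasets a sparse container such as `scipy.sparse.coo_matrix` would likely be more appropriate than a dense array.

```python
import numpy as np

def to_dense_matrix(n_cells, n_genes, count, cell_index, gene_index):
    """Sum counts into a (cells x genes) matrix (hypothetical helper)."""
    matrix = np.zeros((n_cells, n_genes), dtype=np.int64)
    # np.add.at accumulates correctly even if a (cell, gene) pair repeats.
    np.add.at(matrix, (np.asarray(cell_index), np.asarray(gene_index)), np.asarray(count))
    return matrix

# Toy values shaped like the documented return tuple.
uniq_cell = [(10 << 32) | 20, (10 << 32) | 21]
gene_names = ["GeneA", "GeneB", "GeneC"]
count = [3, 1, 5, 2]
cell_index = [0, 0, 1, 1]
gene_index = [0, 2, 1, 2]

matrix = to_dense_matrix(len(uniq_cell), len(gene_names), count, cell_index, gene_index)
print(matrix)
```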
---
#### bgef_writer_cy module
+ Python interface for writing bgef files
+ Related interfaces
  ```
    .. py:module:: bgef_writer_cy

    .. py:function:: generate_bgef(input_file, bgef_file, stromics="Transcriptomics", n_thread = 8, bin_sizes = None, region = None)

    Function to generate common bin GEF file(.bgef).

    :param input_file:  The input file path of gem file or bin1 bgef.
    :param bgef_file:   Output BGEF filepath.
    :param stromics:    The omics type, default "Transcriptomics"
    :param n_thread:    Number of threads, default 8
    :param bin_sizes:   A list of bin sizes, default: 1,10,20,50,100,200,500
    :param region:      A list of region (minX, maxX, minY, maxY)

    .. py:function:: gem2tif(gempath, tif_path)

    Function to generate a TIFF file from a GEM file (.gem or .gem.gz).

    :param gempath:  The input file path of gem file.
    :param tif_path:   Output tiff filepath.
  ```
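
The `bin_sizes` parameter of `generate_bgef` controls the resolutions at which expression is aggregated. As a rough illustration of what binning means, the sketch below sums bin1 records into larger square bins. It is an assumption that bins are aligned to multiples of the bin size and keyed by their lower corner; geftools may handle offsets differently, and `rebin` is a hypothetical helper rather than part of gefpy.

```python
from collections import defaultdict

def rebin(records, bin_size):
    """Aggregate (x, y, count) records into square bins (hypothetical helper).

    Assumption: each bin is keyed by its lower corner, a multiple of bin_size.
    """
    totals = defaultdict(int)
    for x, y, count in records:
        key = (x // bin_size * bin_size, y // bin_size * bin_size)
        totals[key] += count
    return dict(totals)

# Toy bin1 records: (x, y, count).
bin1 = [(0, 0, 2), (3, 9, 1), (10, 0, 4), (19, 19, 1)]
bin10 = rebin(bin1, 10)
print(bin10)
```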
---
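#### cgef_adjust_cy module
+ This module provides functions for lasso selection, MID count filtering, heatmap rendering sampling, and similar features; it is mainly used by stereo map
+ Related interfaces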
  ```
    .. py:module:: cgef_adjust_cy

    .. py:class:: CgefAdjust()

    .. py:method:: get_cell_data(self, bgef, cgef)

        Get raw cell data from cgef and bgef file.

        :param bgef: the bgef file path
        :param cgef: the cgef file path
        :returns: (genelist, vec_cell)

    .. py:method:: write_cgef_adjustdata(self, path, celldata, dnbdata)

        Write the adjusted cell data to a cgef file.

        :param path: the output path
        :param celldata: the input cell data
        :param dnbdata: the input DNB data

    .. py:method:: create_Region_Bgef(self, inpath, outpath, pos)

        generate spatial bin gef file by lasso region datas

        :param inpath: the bgef file path
        :param outpath: set the Output path
        :param pos: lasso region datas

    .. py:method:: create_Region_Cgef(self, inpath, outpath, pos)

        generate cell bin gef file by lasso region datas

        :param inpath: the cgef file path
        :param outpath: set the Output path
        :param pos: lasso region datas

    .. py:method:: get_regiondata_frombgef(self, inpath, bin, thcnt, pos)

        Get gene info from spatial bin gef file by lasso region datas

        :param inpath: the bgef file path
        :param bin: set bin size
        :param thcnt: thread counts
        :param pos: lasso region datas
        :returns vecdata: gene info{genecnt,midcnt,x,y} in region

    .. py:method:: get_regiondata_fromcgef(self, input_path, pos)

        Get cell statistical info from cell bin gef file by lasso region datas

        :param input_path: the cgef file path
        :param pos: lasso region datas
        :returns vecdata: statistical info{cell_count,total_area,average_gene_count,average_exp_count,average_dnb_count,average_area,median_gene_count,median_exp_count,median_dnb_count,median_area} in region

    .. py:method:: get_multilabel_regiondata_bgef(self, inpath, pos, bin=1, thcnt=4)

        The gene name and MIDcount of multiple labels are returned after the lasso

        :param inpath: the input bgef file path
        :param pos: lasso region datas(contain multi labels)
        :param bin: binsize
        :param thcnt: thread count
        :returns region_data, total_mid: region_data(gene_name, MIDcount), total midcount in region

    .. py:method:: get_multilabel_regiondata_cgef(self, inpath, pos)

        The gene name and MIDcount of multiple labels are returned after the lasso

        :param inpath: the input cgef file path
        :param pos: lasso region datas(contain multi labels)
        :returns vecdata, total_data: vecdata(cluster_id, mid_cnt, area, cell_id, x, y), total_data(cluster_id, mid_cnt, area, cell_id)

    .. py:method:: get_position_by_clusterid(self, inpath, clusterid)

        Get position value(x, y) by cluster id from h5ad file

        :param inpath: the input h5ad file
        :param clusterid: input cluster id need to get position
        :returns region_data: position value(x, y)

    .. py:method:: generate_filter_bgef_by_midcnt(self, inpath, outpath, binsize, filter_data, only_filter=False)

        generate complete bgef file by gene&protein mid count value

        :param inpath: input bgef file
        :param outpath: output bgef file
        :param binsize: current binsize
        :param only_filter: generate bgef only have filter gene&protein
        :param filter_data: filter gene&protein name and mid count
        :returns ret: generate result

    .. py:method:: get_filter_bgef_process_rate(self)

        Get the progress of the generation. Must be used together with generate_filter_bgef_by_midcnt.

        :returns ret: current process rate

    .. py:method:: generate_bgef_by_lasso(self, inpath, outpath, pos)

        generate complete bgef file by lasso region datas

        :param inpath: the input bgef file path
        :param outpath: the generate bgef file path
        :param pos: lasso region datas
        :returns ret: generate result

    .. py:method:: get_lasso_bgef_process_rate(self)

        Get the progress of the generation. Must be used together with generate_bgef_by_lasso.

        :returns ret: current process rate

    .. py:method:: generate_bgef_by_coordinate(self, inpath, outpath, cord, bin_size)

        Generate a bgef file from the input coordinates of a lasso region.

        :param inpath: the input bgef file path
        :param outpath: the output bgef file path
        :param cord: lasso region datas
        :param bin_size: input binsize
        :returns ret: generate result

    .. py:method:: generate_cgef_by_coordinate(self, inpath, outpath, cord)

        generate cell bin gef file by coordinate in lasso region

        :param inpath: the input cgef file path
        :param outpath: the output cgef file path
        :param cord: lasso region datas
        :returns ret: generate result
  ```
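
The `get_filter_bgef_process_rate` and `get_lasso_bgef_process_rate` methods are documented as companions to their `generate_*` counterparts, which suggests a pattern where one thread runs the generation and another polls the rate. The sketch below shows that polling pattern in isolation. It assumes the generate call blocks until finished and that the rate getter can safely be called from another thread; neither is confirmed here. `FakeAdjust` is a hypothetical stand-in so the example runs without gefpy or input files.

```python
import threading
import time

def run_with_progress(task, get_rate, interval=0.01):
    """Run task() in a worker thread and poll get_rate() until it finishes."""
    result = {}

    def worker():
        result["value"] = task()

    thread = threading.Thread(target=worker)
    thread.start()
    rates = []
    while thread.is_alive():
        rates.append(get_rate())
        thread.join(interval)       # wait briefly, then poll again
    rates.append(get_rate())        # final reading after completion
    return result.get("value"), rates

class FakeAdjust:
    """Hypothetical stand-in for a CgefAdjust-like object."""

    def __init__(self):
        self._rate = 0

    def generate(self):
        for step in range(1, 6):
            time.sleep(0.005)       # pretend to do work
            self._rate = step * 20
        return 0

    def get_rate(self):
        return self._rate

fake = FakeAdjust()
ret, rates = run_with_progress(fake.generate, fake.get_rate)
print(ret, rates[-1])
```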
---
#### cgef_reader_cy module
+ Provides functions for reading cgef files
+ Related interfaces
  ```
    .. py:module:: cgef_reader_cy

    .. py:class:: CgefR(filepath)

    .. py:method:: get_expression_num(self)

        Get the number of expression records.

    .. py:method:: get_cell_num(self)

        Get the number of cells.

    .. py:method:: get_gene_num(self)

        Get the number of genes.

    .. py:method:: get_gene_names(self)

        Get a list of gene names. The type of gene name is 32 chars.

    .. py:method:: get_cell_names(self)

        Get an array of cell ids. Each cell id is (cell.x <<32 | cell.y)

    .. py:method:: get_cells(self)

        Get cells, each cell include (id, x, y, offset, geneCount, expCount, dnbCount, area, cellTypeID, clusterID)

    .. py:method:: get_genes(self)

        Get genes, each gene include(geneName, offset, cellCount, expCount, maxMIDcount)

    .. py:method:: get_cellid_and_count(self)

        Get the count of each cell in each gene.

        :return:  (cell_id, count)

    .. py:method:: get_geneid_and_count(self)

        Get the count of each gene in each cell.

        :return:  (gene_id, count)

    .. py:method:: get_cellborders(self)

        Gets cell borders.

        :return: borders_list

    .. py:method:: get_filtered_data(self, region, genelist)

        Get the filtered data from cgef by region or gene.

        :param region: rect region(minx,maxx,miny,maxy)
        :param genelist: gene name list

        + uniq_cell is a list of all cells; each value is (exp.x<<32 | exp.y).
        + gene_names is a list of gene names.
        + count is a list holding the MID count of each expression record.
        + cell_index is a list holding the cell index of each expression record.
        + gene_index is a list holding the gene index of each expression record.

        :return: (uniq_cell, gene_names, count, cell_index, gene_index)
  ```
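
Each cell returned by `get_cells` carries an `offset` and a `geneCount`, and `get_geneid_and_count` returns parallel `(gene_id, count)` arrays. The sketch below assumes that `offset` is the position of a cell's first entry in those arrays and that `geneCount` is the number of entries belonging to it. That relationship is an assumption and is not stated in the interface description above. The arrays are toy data and `genes_of_cell` is a hypothetical helper.

```python
import numpy as np

# Toy cell table with only the two fields used here.
cells = np.array(
    [(0, 3), (3, 1), (4, 2)],
    dtype=[("offset", np.uint32), ("geneCount", np.uint16)],
)
# Toy arrays shaped like the (gene_id, count) pair from get_geneid_and_count.
gene_id = np.array([0, 2, 5, 1, 0, 5])
count = np.array([4, 1, 2, 7, 3, 3])

def genes_of_cell(cell_idx):
    """Return {gene_id: count} for one cell (hypothetical helper)."""
    start = int(cells["offset"][cell_idx])
    stop = start + int(cells["geneCount"][cell_idx])
    return dict(zip(gene_id[start:stop].tolist(), count[start:stop].tolist()))

for idx in range(len(cells)):
    print(idx, genes_of_cell(idx))
```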
---
#### cgef_writer_cy module
+ Provides functions for writing cgef files
+ Related interfaces
  ```
    .. py:module:: cgef_writer_cy

    .. py:function:: generate_cgef(cgef_file, bgef_file, mask_file, block_size: list)

    Generate cell bin GEF file from bgef + mask.

    :param cgef_file: Output CGEF filepath.
    :param bgef_file: Input BGEF filepath.
    :param mask_file: Input mask filepath.
    :param block_size: Block size list, usually set to [256,256].

    .. py:function:: cgem_to_cgef(cgem_file, outpath, block_size: list)

    Generate cell bin GEF file from cgem.

    :param cgem_file: Input cgem path.
    :param outpath: Output cgef path.
    :param block_size: Block size list,  usually set to [256,256].
  ```
---
#### gef_to_gem_cy module
+ Provides conversion of gef files to gem files
+ Related interfaces
  ```
    .. py:module:: gef_to_gem_cy

    .. py:class:: GefToGem(strout, strsn, boutexon)

    .. py:method:: bgef2gem(self, strbgef, binsize)

        Create bgem file by bgef.

        :param strbgef: the bgef file path
        :param binsize: set the binsize

    .. py:method:: cgef2gem(self, strcgef, strbgef)

        Create a cgem file from a cgef file and a bgef file.

        :param strcgef: the cgef file path
        :param strbgef: the bgef file path

    .. py:method:: bgef2cgem(self, strmask, strbgef)

        Create a cgem file from a mask file and a bgef file.

        :param strmask: the mask file path
        :param strbgef: the bgef file path
  ```



### API documentation
See the [online API documentation](https://gefpy.readthedocs.io/en/main/index.html).

### Reporting bugs
Open a bug at https://github.com/STOmics/gefpy/issues.

