Metadata-Version: 2.1
Name: linkedlabs
Version: 1.3.2
Summary: Get Similar customers (or rows) in data using DNA Matching Algorithms and Artificial Intelligence on your data!
Home-page: http://www.linkedlabs.co
Author: Nashit Babber
Author-email: nashit@linkedlabs.co
License: UNKNOWN
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: License :: OSI Approved :: GNU General Public License v2 or later (GPLv2+)
Classifier: Operating System :: OS Independent
Description-Content-Type: text/markdown
Provides-Extra: dev
License-File: LICENSE.txt

# linkedlabs
This project is a generic solution for finding the most similar rows (or customers) in a given dataset using DNA matching algorithms and artificial intelligence.

This library automatically takes care of most of the required data science work, such as:
- Data Cleansing
- Missing Value Treatment
- Variable Data Type prediction and casting
- Feature Engineering
- Feature Creation 
- and much more...

All you have to do is feed it **clean, meaningful data.**

### What is Linkedlabs about?
Suppose you have a group of customers and can't tell which of them are similar to each other. For example, some of your customers hold a credit card and others do not, and you have information about both groups such as age, sex, and salary. If you want to know which customers without a credit card to target for a credit card subscription, this library can help.

It uses an algorithm from DNA matching, handles the data science work for you, and returns the customers who are most similar to those who already hold a credit card, all with just one line of code!
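
The credit card scenario could be prepared roughly as follows. This is only a sketch: the `customers` table, its column names, and the `has_credit_card` flag are hypothetical, and the actual `get_similarities` call (covered under Usage) is left commented out.

```python
import pandas as pd

# Hypothetical customer table; all column names and values are made up for illustration.
customers = pd.DataFrame(
    {
        "age": [34, 51, 29, 45, 38],
        "sex": ["F", "M", "M", "F", "M"],
        "salary": [52000, 87000, 41000, 76000, 58000],
        "has_credit_card": [True, True, False, False, False],
    }
)

has_card = customers["has_credit_card"]

# Split into card holders and prospects, dropping the flag so that
# both frames are left with the same feature columns.
card_holders = customers[has_card].drop(columns="has_credit_card").reset_index(drop=True)
prospects = customers[~has_card].drop(columns="has_credit_card").reset_index(drop=True)

# Both frames should expose identical columns before they are compared.
assert list(card_holders.columns) == list(prospects.columns)

# from linkedlabs import get_similarities
# card_holders, prospects = get_similarities(card_holders, prospects)
```

Passing the card holders as the first frame means each card holder would be matched to a prospect, which gives a list of prospects worth targeting.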


##### Here's a YouTube video explaining how to use this library!

[![YouTube Tutorial](http://linkedlabs.co/images/yt.jpeg)](https://youtu.be/b4amKD0NVvY "YouTube Tutorial")

##### Where else can you use it?
This algorithm can be used wherever you want to find similarities between two things. Currently, Linkedlabs is used in the following industries:
- Healthcare
- Automobile
- Digital Security
- Education
- Finance and Insurance
- Ecommerce 
- Marketing
- Recruitment
- Space Science
- And many more...

# Installation
Run the following to install:

```bash
pip install linkedlabs
```

## Usage
Please note that this library accepts data as a **pandas DataFrame ONLY**. Whatever format your data is in, you will have to convert it into a pandas DataFrame first.
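
As a minimal sketch of that conversion, the snippet below builds a DataFrame from a list of records and from CSV text. The sample values and column names are made up; only standard pandas calls are used.

```python
import io

import pandas as pd

# From a list of records (for example, rows fetched from a database or an API).
records = [
    {"age": 34, "sex": "F", "salary": 52000},
    {"age": 51, "sex": "M", "salary": 87000},
]
df_from_records = pd.DataFrame(records)

# From CSV text; pd.read_csv also accepts a file path instead of a buffer.
csv_text = "age,sex,salary\n29,M,41000\n45,F,76000\n"
df_from_csv = pd.read_csv(io.StringIO(csv_text))
```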

Once your data is in a pandas DataFrame, there are two ways to use this library.

1) You can find similarities between two data frames with the **SAME** features (columns) using the code snippet below:
```python
from linkedlabs import get_similarities
import pandas as pd
df1 = pd.read_pickle("your_pandas_dataframe.pkl") ## Any pandas dataframe
df2 = pd.read_pickle("your_another_pandas_dataframe.pkl") ## Any other pandas dataframe having same columns(Features) as df1
df1,df2 = get_similarities(df1,df2)
```

That's it. That's all you need to do! Linkedlabs takes care of the rest. Once it has finished all the calculations, you should see a new column named `best_similar_with_index_number` added to `df1`. This column contains the index of the row in `df2` that was found to be most similar to the corresponding row of `df1`. For example, if the first row of `df1` has 100 in the `best_similar_with_index_number` column, then that row is most similar to the row at index 100 of `df2`, which you can inspect with the following code:
```python
df2.iloc[100] ## Just an example
```
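
To look at every match at once rather than one row at a time, the matched rows can be placed next to the originals. The sketch below does not call linkedlabs: the frames and the values in `best_similar_with_index_number` are hand-made stand-ins for its output, and it assumes the stored numbers are positional (as in the `iloc` example).

```python
import pandas as pd

# Stand-in for df1 after get_similarities; the match indices here are invented.
df1 = pd.DataFrame(
    {
        "age": [25, 40],
        "salary": [30000, 80000],
        "best_similar_with_index_number": [1, 0],
    }
)
df2 = pd.DataFrame({"age": [41, 24], "salary": [82000, 31000]})

# Pull the matched rows out of df2 in the same order as the rows of df1.
positions = df1["best_similar_with_index_number"].to_numpy()
matches = df2.iloc[positions].reset_index(drop=True)

# Put each df1 row next to its match, prefixing the matched columns.
side_by_side = pd.concat(
    [df1.reset_index(drop=True), matches.add_prefix("match_")], axis=1
)
```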
2) The other way to use this library is to compare a data frame with itself. (Don't worry, it will not return the row itself as the most similar; it returns the next most similar row.) Again, all you need is a single line of code, although **it is highly recommended** that you create two separate data frames and compare them as described in point 1.

```python
from linkedlabs import get_internal_similarities
import pandas as pd
df = pd.read_pickle("your_pandas_dataframe.pkl") ## Any pandas dataframe
df = get_internal_similarities(df)
```
That's it! It will automatically create a column that tells you the index of the row most similar to each row in `df`.

### How to activate premium membership?
We also offer a premium membership plan that lets you compute similarities on an unlimited amount of data, rather than the 1000 rows allowed by the basic plan.
If you have purchased our premium plan, well then you rock! Here's how you can activate your premium membership.
You will receive a file for your plan activation. All you have to do is specify the path to that file when running the code, like this:

```python
from linkedlabs import get_similarities
import pandas as pd
df1 = pd.read_pickle("your_pandas_dataframe.pkl") ## Any pandas dataframe
df2 = pd.read_pickle("your_another_pandas_dataframe.pkl") ## Any other pandas dataframe
df1,df2 = get_similarities(df1,df2,activation_key_path="path/to/your/activation/key/file")
```

That's it! Your code will run with premium membership for as long as the subscription stays active.
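
If you are on the basic plan instead, one possible way to stay within the 1000-row limit is to sample your data before comparing it. The `cap_rows` helper below is hypothetical and not part of linkedlabs, and how the library itself behaves with larger inputs on the basic plan is not covered here.

```python
import pandas as pd

BASIC_PLAN_ROW_LIMIT = 1000  # taken from the basic plan description above


def cap_rows(df: pd.DataFrame, limit: int = BASIC_PLAN_ROW_LIMIT, seed: int = 0) -> pd.DataFrame:
    """Return df unchanged if it fits the limit, otherwise a random sample of `limit` rows."""
    if len(df) <= limit:
        return df
    # A fixed seed keeps the sample reproducible between runs.
    return df.sample(n=limit, random_state=seed).reset_index(drop=True)


big = pd.DataFrame({"value": range(2500)})
small = cap_rows(big)
```

Because the sample gets a fresh index, any match indices computed afterwards refer to rows of the sampled frame, not the original one.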

For any questions or feedback, feel free [to mail us at hello@linkedlabs.co](mailto:hello@linkedlabs.co).


