Metadata-Version: 2.4
Name: nlp_text_preprocessing
Version: 0.0.5
Summary: This is a Text Processing Package For NLP
Author: Uditya Narayan Tiwari
Author-email: tiwarimerit@gmail.com
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: spacy
Requires-Dist: textblob
Requires-Dist: beautifulsoup4
Requires-Dist: nltk
Requires-Dist: openpyxl
Requires-Dist: SpeechRecognition==3.10.4
Requires-Dist: pyaudio==0.2.14
Requires-Dist: PrettyTable
Requires-Dist: scikit-learn
Requires-Dist: wordcloud
Requires-Dist: lxml
Requires-Dist: pandas
Requires-Dist: numpy
Requires-Dist: matplotlib
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# Text Preprocessing Python Package


#### Course Link: [Introduction to NLP](https://bit.ly/intro_nlp)

This Python package was created by [Uditya Narayan Tiwari](https://youtube.com/kgptalkie). It provides text preprocessing utilities for natural language processing (NLP) tasks, including text cleaning, lemmatization, sentiment analysis, and feature extraction.

### Installation from PyPI
Install the package with pip:
```
pip install nlp_text_preprocessing
```

### Installation from GitHub
You can install this package from GitHub as follows:
```
pip install git+https://github.com/udityamerit/Text-Processing-Package-For-Natural-Language-Processing.git --upgrade --force-reinstall
```

### Uninstall the Package

To uninstall the package, use the following command:

```bash
pip uninstall nlp_text_preprocessing
```

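### Requirements
The package depends on the Python packages listed below. They are declared as dependencies, so pip normally installs them together with the package. The spaCy English model is not a pip dependency and must be downloaded separately with `python -m spacy download en_core_web_sm`.
```
spacy
textblob
beautifulsoup4
nltk
openpyxl
SpeechRecognition==3.10.4
pyaudio==0.2.14
PrettyTable
scikit-learn
wordcloud
lxml
pandas
numpy
matplotlib
```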
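
To confirm that the requirements are present before using the package, a small check like the one below can help. It is only a sketch: the helper name `missing_modules` is hypothetical and not part of this package, and the list uses import names, which differ from some distribution names (for example `beautifulsoup4` is imported as `bs4`).

```python
import importlib.util

# Import names for the requirements above (assumption: standard import names,
# e.g. beautifulsoup4 -> bs4, scikit-learn -> sklearn).
REQUIRED_MODULES = [
    "spacy", "textblob", "bs4", "nltk", "openpyxl", "speech_recognition",
    "pyaudio", "prettytable", "sklearn", "wordcloud", "lxml", "pandas",
    "numpy", "matplotlib",
    "en_core_web_sm",  # the downloaded spaCy model is itself an importable package
]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means everything was found.
print("Missing:", missing_modules(REQUIRED_MODULES))
```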


## How to Use the Package

### 1. Basic Text Preprocessing

#### Lowercasing Text

```python
import nlp_text_preprocessing as tp

text = "HELLO WORLD!"
processed_text = tp.to_lower_case(text)
print(processed_text)  # Output: hello world!
```

#### Expanding Contractions

```python
import nlp_text_preprocessing as tp

text = "I'm learning NLP."
processed_text = tp.contraction_to_expansion(text)
print(processed_text)  # Output: I am learning NLP.
```

#### Removing Emails

```python
import nlp_text_preprocessing as tp

text = "Contact me at example@example.com"
processed_text = tp.remove_emails(text)
print(processed_text)  # Output: Contact me at 
```

#### Removing URLs

```python
import nlp_text_preprocessing as tp

text = "Check out https://example.com"
processed_text = tp.remove_urls(text)
print(processed_text)  # Output: Check out
```
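
For readers curious how this kind of removal can work, here is a minimal regex-based sketch using only the standard library. The function names `strip_emails` and `strip_urls` and both patterns are illustrative assumptions, not the package's actual implementation, and the patterns are deliberately simple.

```python
import re

# Deliberately simple patterns (assumptions); real emails and URLs are messier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"(?:https?://|www\.)\S+")


def strip_emails(text):
    """Delete anything that looks like an email address."""
    return EMAIL_RE.sub("", text)


def strip_urls(text):
    """Delete anything that starts with http://, https:// or www."""
    return URL_RE.sub("", text)


# repr() makes the leftover trailing space visible.
print(repr(strip_emails("Contact me at example@example.com")))
print(repr(strip_urls("Check out https://example.com")))
```

Both calls leave a trailing space where the match was removed, so a final `.strip()` may be useful in practice.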

#### Removing HTML Tags

```python
import nlp_text_preprocessing as tp

text = "<p>Hello World!</p>"
processed_text = tp.remove_html_tags(text)
print(processed_text)  # Output: Hello World!
```

#### Removing Special Characters

```python
import nlp_text_preprocessing as tp

text = "Hello @World! #NLP"
processed_text = tp.remove_special_chars(text)
print(processed_text)  # Output: Hello World NLP
```

### 2. Advanced Text Processing

#### Lemmatization

```python
import nlp_text_preprocessing as tp

text = "running runs"
processed_text = tp.lemmatize(text)
print(processed_text)  # Output: run run
```

#### Sentiment Analysis

```python
import nlp_text_preprocessing as tp

text = "I love programming!"
sentiment = tp.sentiment_analysis(text)
print(sentiment)  # Output: Sentiment(polarity=0.5, subjectivity=0.6)
```

#### Detecting and Translating Language

```python
import nlp_text_preprocessing as tp
# googletrans is not among this package's declared dependencies; install it separately.
from googletrans import Translator

translator = Translator()
text = "Bonjour tout le monde"
lang = tp.detect_language(text, translator)
translated_text = tp.translate(text, 'en', translator)
print(f"Language: {lang}, Translated: {translated_text}")
# Output: Language: fr, Translated: Hello everyone
```

### 3. Feature Extraction

#### Word Count

```python
import nlp_text_preprocessing as tp

text = "I love NLP."
count = tp.word_count(text)
print(count)  # Output: 3
```

#### Character Count

```python
import nlp_text_preprocessing as tp

text = "I love NLP."
count = tp.char_count(text)
print(count)  # Output: 9 (spaces are not counted)
```
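
The two counts above are consistent with splitting on whitespace for words and ignoring whitespace for characters. The sketch below reproduces them under that assumption; `count_words` and `count_chars` are hypothetical names, not functions from this package.

```python
def count_words(text):
    """Count whitespace-separated tokens."""
    return len(text.split())


def count_chars(text):
    """Count characters, skipping spaces, tabs and newlines."""
    return sum(1 for ch in text if not ch.isspace())


print(count_words("I love NLP."))  # 3
print(count_chars("I love NLP."))  # 9
```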

#### N-Grams

```python
import nlp_text_preprocessing as tp

text = "I love NLP"
ngrams = tp.n_grams(text, n=2)
print(ngrams)  # Output: [('I', 'love'), ('love', 'NLP')]
```
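
As a rough illustration of what word n-grams are, the following standalone sketch builds them by zipping shifted copies of the token list. `word_ngrams` is a hypothetical helper and whitespace tokenization is an assumption; the package's own `n_grams` may tokenize differently.

```python
def word_ngrams(text, n=2):
    """Return all runs of n consecutive whitespace-separated tokens as tuples."""
    if n < 1:
        raise ValueError("n must be at least 1")
    tokens = text.split()
    # zip stops at the shortest slice, so exactly len(tokens) - n + 1 tuples result.
    return list(zip(*(tokens[i:] for i in range(n))))


print(word_ngrams("I love NLP", n=2))  # [('I', 'love'), ('love', 'NLP')]
```

When the text has fewer than `n` tokens, the result is an empty list.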

### 4. Full Example: Cleaning Text

Here's an example of how you might use several functions together to clean text data:

```python
import nlp_text_preprocessing as tp

text = "I'm loving this NLP tutorial! Contact me at https://www.linkedin.com/in/uditya-narayan-tiwari-562332289/  Visit https://udityanarayantiwari.netlify.app/"
cleaned_text = tp.clean_text(text)
print(cleaned_text)
# Output: i am loving this nlp tutorial contact me at visit
```
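
To make the individual cleaning steps visible, here is a standalone sketch that produces the same result for this input using only the standard library. The name `simple_clean`, the tiny contraction map, and the step order are assumptions for illustration; `tp.clean_text` may work differently internally.

```python
import re

# Tiny illustrative contraction map (assumption); a real one would be much larger.
CONTRACTIONS = {"i'm": "i am", "don't": "do not", "it's": "it is"}


def simple_clean(text):
    """Lowercase, expand contractions, drop URLs and punctuation, tidy spaces."""
    text = text.lower()
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    text = re.sub(r"(?:https?://|www\.)\S+", " ", text)  # remove URLs
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # remove special characters
    return " ".join(text.split())  # collapse repeated whitespace


raw = ("I'm loving this NLP tutorial! Contact me at "
       "https://www.linkedin.com/in/uditya-narayan-tiwari-562332289/  "
       "Visit https://udityanarayantiwari.netlify.app/")
print(simple_clean(raw))
# i am loving this nlp tutorial contact me at visit
```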

### 5. One-Shot Feature Extraction

```python
import nlp_text_preprocessing as tp

tp.extract_features("I love NLP")
```

## Notes

- Be cautious when using heavy operations like `lemmatize` and `spelling_correction` on very large datasets, as they can be time-consuming.
- The package supports custom cleaning and preprocessing pipelines by using these modular functions together.
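
One way to act on both notes is to compose steps into a single callable and cache the slow ones so that repeated texts are processed only once. The sketch below is an assumption about how that could be done: `make_pipeline` and `cached` are hypothetical helpers, and `str.lower` and `slow_step` are stand-ins for functions such as `tp.to_lower_case` or `tp.lemmatize`.

```python
from functools import lru_cache, reduce


def make_pipeline(*steps):
    """Return a function that applies each step to the text, left to right."""
    def run(text):
        return reduce(lambda acc, step: step(acc), steps, text)
    return run


def cached(step, maxsize=10_000):
    """Wrap a slow step so that repeated inputs are computed only once."""
    return lru_cache(maxsize=maxsize)(step)


calls = []  # records how often the slow step really runs


def slow_step(text):
    """Stand-in for an expensive operation such as lemmatization."""
    calls.append(text)
    return text.strip()


pipeline = make_pipeline(str.lower, cached(slow_step))
results = [pipeline(t) for t in ["  Hello ", "  HELLO ", "Bye"]]
print(results, len(calls))  # ['hello', 'hello', 'bye'] 2
```

Caching helps most when a dataset contains many duplicate strings; for mostly unique texts it only adds memory overhead.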
