Introduction
PyElit also includes a module that classifies texts dealing with urban problems. At present, the model cannot be trained with your own data.
Objective
The module classifies a text into one of the urban-problem topics the model was trained on: basic sanitation, traffic, construction works, and miscellaneous. The model was trained on television news reports from JPB Calendar, part of the JPB newscast on TV Cabo Branco, an affiliate of TV Globo.
It also lets you view the text documents assigned to a topic and the keywords associated with a given topic.
How to use
The TopicModeling class is quite simple to use. Import it, instantiate an object of the class, and then call the main method, rate_text.
Let’s see some examples of how to use it:
Topic Modeling: Classify a text
from pyelit import TopicModeling
topicModeling = TopicModeling()
result = topicModeling.rate_text("o ginásio da Escola Maria Honoriana Santiago está com obras paradas desde do início do ano.")
print("Topics and probabilities:", result)
print("Topic:", topicModeling.get_topic(result[0][0]))
Output for this example:
Topics and probabilities: [(2, 0.80940521), (0, 0.064506963), (1, 0.063506372), (3, 0.062581457)]
Topic: obras
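The result of rate_text is a list of (topic id, probability) pairs, so it can be post-processed with plain Python. The following is a minimal sketch, not part of PyElit: the helper name best_topic and the min_probability threshold are hypothetical, and the sample values are copied from the example outputs in this document so the snippet runs without the library installed.

```python
# Sample values copied from the example outputs in this document.
# In real use they would come from rate_text() and print_topics().
result = [(2, 0.80940521), (0, 0.064506963), (1, 0.063506372), (3, 0.062581457)]
topic_names = {0: 'saneamento', 1: 'trânsito', 2: 'obras', 3: 'diversos'}


def best_topic(scored_topics, names, min_probability=0.5):
    """Hypothetical helper: return (name, probability) for the most likely
    topic, or None when no topic reaches min_probability."""
    # Use max() rather than relying on the list being sorted.
    topic_id, probability = max(scored_topics, key=lambda pair: pair[1])
    if probability < min_probability:
        return None
    return names[topic_id], probability


print(best_topic(result, topic_names))  # ('obras', 0.80940521)
```

Returning None below a threshold is one way to avoid assigning a topic to texts the model is unsure about; the value 0.5 is only an illustrative default.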
Topic Modeling: Print topics
from pyelit import TopicModeling
topicModeling = TopicModeling()
print(topicModeling.print_topics())
Output for this example:
{0: 'saneamento', 1: 'trânsito', 2: 'obras', 3: 'diversos'}
Topic Modeling: Print keywords and their weights on each topic
from pyelit import TopicModeling
topicModeling = TopicModeling()
print(topicModeling.print_keywords(quant_max_palavras=2))
Output for this example:
[(0, '0.016*"água" + 0.015*"esgoto"'), (1, '0.025*"velocidad" + 0.024*"faixa"'), (2, '0.012*"escola" + 0.011*"obra"'), (3, '0.034*"estrada" + 0.015*"féria"')]
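Each keyword entry is a string of the form weight*"word" joined by " + ". If you need the weights as numbers, the strings can be parsed with a regular expression. This is a sketch that assumes the output format shown above; the helper name parse_keywords is hypothetical and not part of PyElit, and the input is copied from the example output.

```python
import re

# Sample value copied from the example output above.
keywords = [
    (0, '0.016*"água" + 0.015*"esgoto"'),
    (1, '0.025*"velocidad" + 0.024*"faixa"'),
    (2, '0.012*"escola" + 0.011*"obra"'),
    (3, '0.034*"estrada" + 0.015*"féria"'),
]

# Matches one weight*"word" term, capturing the weight and the word.
TERM = re.compile(r'([0-9.]+)\*"([^"]+)"')


def parse_keywords(entries):
    """Hypothetical helper: map each topic id to a list of (word, weight) pairs."""
    return {
        topic_id: [(word, float(weight)) for weight, word in TERM.findall(text)]
        for topic_id, text in entries
    }


parsed = parse_keywords(keywords)
print(parsed[2])  # [('escola', 0.012), ('obra', 0.011)]
```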
Topic Modeling: Change representativeness of topic names
from pyelit import TopicModeling
topicModeling = TopicModeling()
topicModeling.represent_topics([0, 1, 2, 3], ['Sanitation', 'Traffic','Construction', 'Several'])
print(topicModeling.print_topics())
Output for this example:
{0: 'Sanitation', 1: 'Traffic', 2: 'Construction', 3: 'Several'}
Topic Modeling: Print a topic by its id
from pyelit import TopicModeling
topicModeling = TopicModeling()
print("Topic with id = 1: " + topicModeling.get_topic(id_topic=1))
Output for this example:
Topic with id = 1: Traffic