content-extractor-pi

content-extractor-pi is a Python module that extracts user-defined pieces of content from a set of documents. The content can be a paragraph on a certain topic, headers, page numbers, and so on.

PyPI

post-processor-pi

post-processor-pi is a Python module that uses the output of Document AI (a tool available on GCP) to create a JSON file that stores a document's text in an organized way, reflecting its original structure.

PyPI

Multi-Class
Text Classification

Multi-class text classification project that compares different techniques for representing text as vectors and different models.

Full Story

Dictionary-Based
Sentiment Analysis

Project from my bachelor's thesis that compares different sentiment analysis dictionaries by the polarity they assign.

Full Story

About me

My name is Paolo Italiani and I'm currently based in Arezzo, Italy. I am passionate about artificial intelligence techniques and excited to learn new things and apply them to real-world problems.

  • Coding

    As a statistician, I learned R as my first programming language, but Python is now my favorite. I also know a little SQL and HTML.

  • Work Experience

    I worked for six months at Qarik as a Data Scientist intern. During this period I helped the company enhance its Document Ingestion skillset by developing two Python libraries for processing documents (summary PDF).

  • Education

    I'm an MSc student in Statistical Sciences at the University of Bologna, and I have a Bachelor's degree in Statistical Sciences.

  • Curriculum Vitae

    You can find more details about me in my curriculum vitae (last updated September 2021) here.