Smart Invoice Analysis System
Client: Moto Flota
Industry: Automotive
Completion Date: 2024
Short: A system supporting invoice and cost estimate settlements through intelligent document analysis (AI).

About the Client
Moto Flota is a leader in the car fleet management industry. The company currently manages over 65,000 vehicles and collaborates with an extensive network of over 880 service centers across the country.
Goal of the Project
The goal of the project was to create and implement an AI-based system that analyzes and extracts data from financial documents supplied by the service points working with the client. The extracted data should include general information, such as contractor details and the invoice amount, as well as detailed data about the vehicle, individual parts, and the services performed. The system should work regardless of document format (PDF, JPG) or data structure, since there is no standard format for invoices or cost estimates and every service point uses its own.
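To make the target data more concrete, the sketch below shows one possible shape for an extracted record. It is an illustration only: the class names (`LineItem`, `ExtractedDocument`) and every field name are assumptions, not the schema used in the actual system.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List, Optional


@dataclass
class LineItem:
    """One part or service position on a document (hypothetical fields)."""
    description: str
    quantity: Decimal
    unit_price: Decimal

    @property
    def total(self) -> Decimal:
        # Decimal avoids binary floating-point rounding on money values.
        return self.quantity * self.unit_price


@dataclass
class ExtractedDocument:
    """General and detailed data for one invoice or cost estimate (assumed layout)."""
    document_type: str                    # e.g. "invoice" or "cost_estimate"
    contractor_name: str
    contractor_tax_id: Optional[str]      # None when the document does not state it
    vehicle_registration: Optional[str]
    net_amount: Decimal
    items: List[LineItem] = field(default_factory=list)

    def items_total(self) -> Decimal:
        """Sum of all line items, useful as a consistency check against net_amount."""
        return sum((item.total for item in self.items), Decimal("0"))


doc = ExtractedDocument(
    document_type="invoice",
    contractor_name="Example Service Center",   # made-up sample value
    contractor_tax_id=None,
    vehicle_registration="XX 12345",            # made-up sample value
    net_amount=Decimal("220.50"),
    items=[
        LineItem("Brake pads", Decimal("2"), Decimal("50.00")),
        LineItem("Labour", Decimal("1"), Decimal("120.50")),
    ],
)
```

Comparing `doc.items_total()` with `doc.net_amount` is one simple way to flag a document whose extracted line items do not add up to the stated amount.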
Outcome
The system retrieves cost estimates and invoices from an email inbox, processes them using artificial intelligence to extract specific information, and sends the resulting data to the client in a format ready for further use. The AI components were built using large language models (LLMs) and natural language processing (NLP). This approach enables the system to "understand" the documents and analyze them accurately instead of relying on static content mapping. As a result, it handles new and previously unseen document types and data formats effectively.
The system also includes a notification feature that informs users about significant events (receiving an invoice, sending processed data, encountering errors, etc.) via various channels such as email or popular internet messengers.
The system allows user intervention in the process, enabling manual corrections when errors occur at different stages, such as lack of access to the email inbox, encrypted files, or low confidence in the data provided by the AI model.
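The low-confidence case can be pictured as a simple threshold check, sketched below. The threshold value, the function names, and the idea of one confidence score per field are all assumptions made for illustration; the source does not describe how confidence is measured.

```python
from typing import Dict, List

# Assumed cut-off; a real deployment would tune this against reviewed documents.
REVIEW_THRESHOLD = 0.85


def fields_needing_review(
    confidences: Dict[str, float], threshold: float = REVIEW_THRESHOLD
) -> List[str]:
    """Return the names of extracted fields whose confidence is below the threshold."""
    return sorted(name for name, score in confidences.items() if score < threshold)


def route(confidences: Dict[str, float]) -> str:
    """Send the document to a human if any field is uncertain, otherwise accept it."""
    return "manual_review" if fields_needing_review(confidences) else "auto_accept"


# Hypothetical per-field scores for one document.
scores = {"net_amount": 0.99, "contractor_name": 0.97, "vehicle_registration": 0.60}
decision = route(scores)
uncertain = fields_needing_review(scores)
```

Returning the list of uncertain fields, rather than a single yes/no flag, lets a reviewer correct only the values that need attention.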
The AI model gathers and analyzes results to learn and minimize errors and uncertainties in the output data over time.
Solutions
The system is based on a distributed network of microservices, operating independently in a "push-pull" architecture. Individual services communicate asynchronously using a queuing system via a shared communication bus.

Key advantages of this solution include:
- Flexibility – Each microservice has a unique responsibility, allowing for easy modifications and independent testing.
- Scalability – Adding new services doesn't affect the existing ones, reducing the cost and complexity of system expansion.
- Reliability – The queue operates in "persistent" mode, meaning messages are stored for a specified time. If a service fails, its messages remain intact and wait until the service is restored, so a single service outage does not cause data loss.
- Security – Only a limited number of system nodes are exposed externally, which reduces the attack surface.
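The reliability point can be illustrated with a small acknowledgement sketch. This is an in-memory stand-in written for this example, not RabbitMQ and not the system's code: it does not persist anything to disk and only mimics the idea that a message stays in the queue until a service confirms it has been processed. All names (`InMemoryQueue`, `consume_one`) are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict, Optional, Tuple


class InMemoryQueue:
    """Toy queue with acknowledgements (a stand-in for a real message broker)."""

    def __init__(self) -> None:
        self._ready: Deque[str] = deque()
        self._unacked: Dict[int, str] = {}
        self._next_tag = 0

    def publish(self, body: str) -> None:
        self._ready.append(body)

    def get(self) -> Optional[Tuple[int, str]]:
        """Hand out one message; it stays tracked until acked or nacked."""
        if not self._ready:
            return None
        body = self._ready.popleft()
        self._next_tag += 1
        self._unacked[self._next_tag] = body
        return self._next_tag, body

    def ack(self, tag: int) -> None:
        """Processing succeeded: forget the message."""
        del self._unacked[tag]

    def nack(self, tag: int) -> None:
        """Processing failed: put the message back at the front of the queue."""
        self._ready.appendleft(self._unacked.pop(tag))

    def pending(self) -> int:
        return len(self._ready) + len(self._unacked)


def consume_one(queue: InMemoryQueue, handler: Callable[[str], None]) -> bool:
    """Process one message; requeue it if the handler raises."""
    delivery = queue.get()
    if delivery is None:
        return False
    tag, body = delivery
    try:
        handler(body)
    except Exception:
        queue.nack(tag)
        return False
    queue.ack(tag)
    return True


def broken_service(body: str) -> None:
    raise RuntimeError("service temporarily unavailable")


processed = []
queue = InMemoryQueue()
queue.publish("invoice-1.pdf")

first_attempt = consume_one(queue, broken_service)      # fails, message is requeued
pending_after_failure = queue.pending()
second_attempt = consume_one(queue, processed.append)   # restored service succeeds
```

The message is removed only after the handler returns without error, so a failing service leaves its work waiting in the queue rather than losing it.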
Technologies
- Python
- Go
- Docker
- RabbitMQ
- React JS
- AI (LLM - Large Language Model, NLP - Natural Language Processing)
- Machine Learning
- Elasticsearch, Kibana
Key Challenges
The client works with numerous external partners, each having its own format for invoices or cost estimates. These formats may change over time (e.g., font, data arrangement). Additionally, new contractors may introduce incompatible formats.
Document delivery methods also vary. Partners provide cost estimates as text-based PDFs, scanned PDFs, or image files (JPG, PNG, etc.). Documents may have different character encodings, or come from various operating systems and accounting software. These variables create virtually unlimited scenarios.
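One way to cope with mixed delivery formats is to inspect a file's leading bytes instead of trusting its extension. The sketch below uses the standard file signatures for PDF, JPEG, and PNG; the pipeline names and the `has_text_layer` flag are assumptions for illustration, and deciding whether a PDF actually contains a text layer would need a PDF library that is not shown here.

```python
def detect_format(data: bytes) -> str:
    """Identify a file by its leading signature bytes."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    return "unknown"


def choose_pipeline(data: bytes, has_text_layer: bool = False) -> str:
    """Pick a processing path (hypothetical names) for an incoming attachment.

    has_text_layer is assumed to be determined elsewhere, e.g. by trying to
    read text from the PDF; scanned PDFs and images go through OCR instead.
    """
    fmt = detect_format(data)
    if fmt == "pdf":
        return "text_extraction" if has_text_layer else "ocr"
    if fmt in ("jpeg", "png"):
        return "ocr"
    return "manual_review"


text_pdf = choose_pipeline(b"%PDF-1.7 ...", has_text_layer=True)
scanned_pdf = choose_pipeline(b"%PDF-1.4 ...")
photo = choose_pipeline(b"\xff\xd8\xff\xe0 ...")
other = choose_pipeline(b"PK\x03\x04 ...")
```

Files that match none of the known signatures fall through to manual review rather than being guessed at.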
Static document analysis based on traditional algorithms would require frequent programming interventions to implement new strategies for handling exceptions or format changes. Such an approach would be cost-inefficient. Leveraging artificial intelligence became essential to "understand" the analyzed documents regardless of data representation.
Future Development
The system is currently being enhanced and expanded with additional features. The primary goal is to improve the AI model to deliver highly accurate data and independently handle as many exceptions as possible (e.g., atypical invoices).