Architecture

Workflow

  1. Entry point

Users can enter their queries into the input box, which also includes some pre-filled text for convenience.

To ensure efficiency and relevance, the input first passes through a domain-checking sub-agent, which verifies whether the question is related to the finance domain. There are two reasons for this check:

  1. Running the full agentic pipeline on out-of-domain questions unnecessarily increases costs.
  2. The system is optimized specifically for finance-related prompts. Handling unrelated queries could lead to hallucinations or irrelevant responses, which may create the impression that the system is malfunctioning.

By filtering at the entry point, we maintain both performance and user trust.
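The gate described above can be sketched roughly as follows. This is only an illustration: the keyword-based `keyword_domain_check` is a hypothetical stand-in for the actual LLM-based sub-agent, and `handle_query`, `FINANCE_KEYWORDS`, and `REJECTION_MESSAGE` are names invented for the example.

```python
from typing import Callable

# Hypothetical keyword list; the real domain check is an LLM sub-agent.
FINANCE_KEYWORDS = {"stock", "revenue", "dividend", "earnings", "margin", "profit"}

REJECTION_MESSAGE = "This assistant only answers finance-related questions."


def keyword_domain_check(query: str) -> bool:
    """Stand-in classifier: True if the query mentions any finance keyword."""
    words = {word.strip(".,?!").lower() for word in query.split()}
    return bool(words & FINANCE_KEYWORDS)


def handle_query(
    query: str,
    is_finance: Callable[[str], bool],
    run_pipeline: Callable[[str], str],
) -> str:
    """Run the expensive pipeline only when the domain check passes."""
    if not is_finance(query):
        return REJECTION_MESSAGE
    return run_pipeline(query)


def fake_pipeline(query: str) -> str:
    """Placeholder for the full agentic pipeline."""
    return f"answer for: {query}"


print(handle_query("What was the Q2 revenue?", keyword_domain_check, fake_pipeline))
print(handle_query("Who won the match?", keyword_domain_check, fake_pipeline))
```

Passing the classifier in as a callable keeps the gate independent of how the check is implemented, so the keyword stub can be swapped for an LLM call without touching the routing logic.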

  2. Breakdown sub-agent

Users often pack several requests into a single question, and this is especially common in finance. To give the downstream agents clear direction, this sub-agent breaks the question down into sub-questions using a predefined flow. The other agents can then reason about each specific point and deliver good, detailed results.
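As a rough illustration of how the breakdown output might be consumed, the sketch below assumes the sub-agent replies with a JSON array of objects with `topic` and `question` fields. That schema, the `SubQuestion` class, and the topic labels are assumptions made for the example, not the project's actual format.

```python
import json
from dataclasses import dataclass


@dataclass
class SubQuestion:
    topic: str      # hypothetical label, e.g. "financial_report" or "latest_price"
    question: str   # the focused sub-question handed to a downstream agent


def parse_breakdown(raw_json: str) -> list[SubQuestion]:
    """Parse the breakdown sub-agent's JSON reply into SubQuestion objects."""
    items = json.loads(raw_json)
    return [
        SubQuestion(topic=item["topic"], question=item["question"].strip())
        for item in items
        if item.get("question", "").strip()  # skip entries with no question text
    ]


# Example reply for the user question:
# "How did revenue change last quarter, and what is the stock price now?"
reply = """[
  {"topic": "financial_report", "question": "How did revenue change last quarter?"},
  {"topic": "latest_price", "question": "What is the current stock price?"}
]"""

for sub in parse_breakdown(reply):
    print(f"[{sub.topic}] {sub.question}")
```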

  3. Finance data extractor

Financial reports often contain tabular information, and most conventional (non-AI) PDF parsers struggle to extract it correctly. I tried several of them, but they returned scattered, meaningless data. AI providers, in contrast, use state-of-the-art OCR models to process PDFs more effectively. I used Gemini to read the PDF content, which returned the data in a format suitable for LLMs. This approach let me retrieve the content more accurately while preserving the structure of the original PDF.
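A minimal sketch of this kind of extraction, assuming the `google-genai` Python SDK and an API key set in the environment. The prompt wording, the function name `extract_report_with_gemini`, and the default model name are assumptions for illustration; the source does not state which prompt or model was used.

```python
from pathlib import Path

# Assumed prompt wording; the actual prompt is not given in this document.
EXTRACTION_PROMPT = (
    "Extract the full content of this financial report. "
    "Reproduce every table as a Markdown table and keep the original section order."
)


def extract_report_with_gemini(pdf_path: str, model: str = "gemini-2.5-flash") -> str:
    """Send a PDF to Gemini and return the extracted text.

    Requires the google-genai package and an API key in the environment.
    """
    # Imported lazily so this module can be loaded without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client()  # picks up the API key from the environment
    pdf_part = types.Part.from_bytes(
        data=Path(pdf_path).read_bytes(),
        mime_type="application/pdf",
    )
    response = client.models.generate_content(
        model=model,
        contents=[pdf_part, EXTRACTION_PROMPT],
    )
    return response.text or ""
```

Asking for Markdown tables is one way to keep rows and columns aligned in plain text, so that a downstream LLM can read the figures without guessing at the layout.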

  4. Meeting transcription extractor

Meeting transcriptions are also in PDF format but typically do not contain tables, so a conventional (non-AI) parser is sufficient. I initially tried various PDF parsers to extract the content, but in this case the OCR-based parsers failed: they did not return the content as it appeared in the PDF. After trying various workarounds, I found that pdfplumber works very well for non-OCR PDF extraction. It preserved the structure and returned the text accurately.
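A minimal sketch of text-layer extraction with pdfplumber. The helper names `join_pages` and `extract_transcript` are made up for this example, and the page-joining behaviour (blank line between pages, empty pages skipped) is an assumption rather than the project's actual code.

```python
from typing import Iterable, Optional


def join_pages(page_texts: Iterable[Optional[str]]) -> str:
    """Join per-page text, skipping pages where no text was found."""
    cleaned = [text.strip() for text in page_texts if text and text.strip()]
    return "\n\n".join(cleaned)


def extract_transcript(pdf_path: str) -> str:
    """Extract the embedded text layer of a transcript PDF with pdfplumber."""
    import pdfplumber  # imported lazily; install with `pip install pdfplumber`

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() reads the PDF's text layer directly; no OCR is involved.
        return join_pages(page.extract_text() for page in pdf.pages)


# The joining step on its own, with an empty page in the middle:
print(join_pages(["Speaker A: Good morning.", None, "Speaker B: Thanks for joining."]))
```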

  5. Latest results

The assignment asked for the latest price of the stock. Google ADK ships with a Google Search tool, which I used for this. I also prompted the agent to return the 24-hour change, weekly change, monthly change, annual change, market capitalization, all-time high, all-time low, and dividends.
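A rough sketch of such an agent, assuming the `google-adk` Python package. The agent name, model name, and instruction wording are assumptions for illustration; only the use of the Google Search tool and the list of metrics come from the description above.

```python
METRICS = [
    "latest price",
    "24-hour change",
    "weekly change",
    "monthly change",
    "annual change",
    "market capitalization",
    "all-time high",
    "all-time low",
    "dividends",
]


def build_instruction(metrics=METRICS) -> str:
    """Build an instruction that asks for each metric (assumed wording)."""
    bullet_list = "\n".join(f"- {metric}" for metric in metrics)
    return (
        "Use Google Search to find current data for the stock the user asks about. "
        "Report each of the following:\n" + bullet_list
    )


def build_price_agent():
    """Create an ADK agent that uses the built-in Google Search tool."""
    # Imported lazily; requires the google-adk package.
    from google.adk.agents import Agent
    from google.adk.tools import google_search

    return Agent(
        name="latest_results_agent",  # hypothetical name
        model="gemini-2.0-flash",     # assumed model; the source does not name one
        description="Fetches the latest stock price and related metrics.",
        instruction=build_instruction(),
        tools=[google_search],
    )


print(build_instruction())
```

Keeping the metrics in a list makes it easy to add or drop a figure without rewriting the prompt by hand.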