Users can enter their queries into the input box, which also includes some pre-filled text for convenience.
To ensure efficiency and relevance, the input first passes through a domain-checking sub-agent. This agent verifies whether the question is related to the finance domain. The reason for this check is twofold:
By filtering at the entry point, we maintain both performance and user trust.
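The gating step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `gate_query`, `REFUSAL`, and `keyword_is_finance` are hypothetical names, and the keyword matcher is only a stand-in for the LLM-based sub-agent so that the example runs on its own.

```python
from typing import Callable

# Hypothetical refusal message shown when a query is out of scope.
REFUSAL = "This assistant only answers finance-related questions."


def gate_query(
    query: str,
    is_finance: Callable[[str], bool],
    answer: Callable[[str], str],
) -> str:
    """Run the domain check first; call the main pipeline only if it passes."""
    if not query.strip():
        return REFUSAL
    if not is_finance(query):
        return REFUSAL
    return answer(query)


# Stand-in classifier for illustration only; the real check is an LLM sub-agent.
FINANCE_TERMS = {"revenue", "stock", "dividend", "earnings", "margin"}


def keyword_is_finance(query: str) -> bool:
    """Return True if the query mentions any of the sample finance terms."""
    words = {w.strip(".,?!").lower() for w in query.split()}
    return bool(words & FINANCE_TERMS)
```

Passing the classifier in as a function keeps the gate independent of how the check is implemented, so the keyword stand-in could be swapped for a call to the sub-agent without touching the rest of the flow.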
Users tend to pack every detail into a single question, and this is especially common in finance. The agent, however, needs clear direction on what the user actually wants. It therefore helps to break the question down using a predefined flow, so that the resulting sub-questions can be handed to other agents, each reasoning about one specific point, to produce a detailed answer.
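One way to picture the decomposition step is the sketch below. It assumes the model is asked to return a JSON array of sub-questions. The prompt wording, the `decompose` function, and the `llm` callable are all hypothetical and stand in for whatever the actual sub-agent does.

```python
import json
from typing import Callable, List

# Hypothetical prompt; the real flow may use different wording or structure.
DECOMPOSE_PROMPT = (
    "Split the user's finance question into independent sub-questions. "
    "Return only a JSON array of strings.\n\nQuestion: {question}"
)


def decompose(question: str, llm: Callable[[str], str]) -> List[str]:
    """Ask the model for sub-questions; fall back to the original question."""
    raw = llm(DECOMPOSE_PROMPT.format(question=question))
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # The model did not return valid JSON, so keep the question whole.
        return [question]
    if not isinstance(parsed, list):
        return [question]
    subs = [s.strip() for s in parsed if isinstance(s, str) and s.strip()]
    return subs or [question]
```

Falling back to the original question means a malformed model response degrades to a single-step answer instead of failing the whole request.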
Financial reports often contain tabular information, and most conventional, rule-based PDF parsers struggle to extract it correctly. I tried several of them, but they returned scattered, meaningless data. In contrast, AI providers use state-of-the-art OCR models to process PDFs more effectively. I used Gemini to read the PDF content, which produced output in a format suitable for LLMs. This approach retrieved the content more accurately and preserved the structure of the original PDF.
Meeting transcripts are also PDFs but typically contain no tables, so a conventional, rule-based parser is enough. I initially tried several parsers; in this case the OCR-based ones failed and did not return the content as it appeared in the PDF. After trying various workarounds, I found that pdfplumber works very well for non-OCR-based PDF extraction: it preserved the structure and returned the text accurately.
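A minimal sketch of text extraction with pdfplumber might look like the following. The helper names `join_pages` and `extract_transcript_text` are hypothetical, and joining pages with a blank line is an assumption about the desired output, not something the original setup specifies.

```python
from typing import Iterable


def join_pages(page_texts: Iterable[str]) -> str:
    """Join per-page text, dropping empty pages and surrounding whitespace."""
    cleaned = [t.strip() for t in page_texts if t and t.strip()]
    return "\n\n".join(cleaned)


def extract_transcript_text(pdf_path: str) -> str:
    """Extract the text layer of a PDF page by page using pdfplumber."""
    # Imported here so join_pages stays usable without pdfplumber installed.
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() can return None for pages without a text layer.
        return join_pages(page.extract_text() or "" for page in pdf.pages)
```

Usage would be something like `text = extract_transcript_text("transcript.pdf")`, where the file name is a placeholder.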
The assignment asked for the latest stock price. Google ADK ships with a Google Search tool, so I used it. I also prompted the agent to report the 24-hour change, weekly change, monthly change, annual change, market capitalization, all-time high, all-time low, and dividends.
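The instruction given to the search-enabled agent could be assembled along these lines. This is only a sketch: the wording of the prompt, the `METRICS` list name, and `build_price_instruction` are hypothetical, and the request to flag missing values is an added assumption. Wiring the text into an ADK agent is left out.

```python
# Metrics listed in the write-up, in the order they are mentioned.
METRICS = [
    "latest price",
    "24-hour change",
    "weekly change",
    "monthly change",
    "annual change",
    "market capitalization",
    "all-time high",
    "all-time low",
    "dividends",
]


def build_price_instruction(ticker: str) -> str:
    """Build the instruction text asking the agent for each metric."""
    bullet_list = "\n".join(f"- {m}" for m in METRICS)
    return (
        f"Use the search tool to look up current market data for {ticker}. "
        # Assumption: asking the agent to flag missing values explicitly.
        "Report each of the following, and say so explicitly if a value "
        "cannot be found:\n" + bullet_list
    )
```

Keeping the metrics in a list makes it easy to add or drop a field without rewriting the prompt by hand.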