Creating an AI chatbot that helps understand the significance of a UN-gifted artifact

MehtA+
Jul 17, 2024

By Ishan V, Sushmit C, Anirudh B, Jason H — MehtA+ AI/Machine Learning Research Bootcamp students

In a project in partnership with CUNY professor Elizabeth Macaulay, high school students in the MehtA+ AI/Machine Learning Research Bootcamp were given a United Nations Gifts Dataset and tasked with using AI to understand why the gifts were given. In part 5 of a seven-part series, students explore ways in which AI can help us understand archaeological gifts better.

If you would like to learn more about MehtA+ AI/Machine Learning Research Bootcamp, check out https://mehtaplustutoring.com/ai-ml-research-bootcamp/.

*******************

Code: https://github.com/MehtaPlusTutoring/studentprojects/blob/main/aimlresearchbootcamp/2024/midterm/gemini.ipynb

Research Question:

Our primary research question was “Why is a specific artifact gifted by the UN considered significant?” While information about these artifacts is available on the UN’s website, it lacks depth and is missing sources. This creates a problem for anyone interested in learning more about a specific artifact.

Our Solution:

To solve this problem, we decided to create an AI model that can deepen the user’s knowledge of a specific artifact. We built this model in Google Colab and wrote it in Python. We relied heavily on external Python libraries such as ChromaDB, langchain, langchain-community, sentence_transformers, BeautifulSoup, and Pandas. These libraries help extract information from various sources and put it into a form the computer can interpret. We then used Google’s Gemini API to generate a response based on the collected data.
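To make the retrieval idea concrete, here is a minimal sketch of how collected passages might be ranked against a user's question before being handed to the model. It is not the project's code: a toy bag-of-words vector stands in for the sentence_transformers embeddings, a plain list stands in for the ChromaDB collection, and the passages and function names are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def vectorize(text):
    """Toy stand-in for an embedding: a bag-of-words count vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question."""
    q = vectorize(question)
    ranked = sorted(passages, key=lambda p: cosine(q, vectorize(p)), reverse=True)
    return ranked[:k]


# Invented example passages (assumption: not taken from the UN Gifts site).
passages = [
    "The bronze sculpture was presented to the United Nations in 1988.",
    "Visitors can book guided tours of the headquarters building.",
    "The sculpture symbolizes disarmament and the hope for peace.",
]

top = retrieve("What does the sculpture symbolize?", passages, k=2)
for passage in top:
    print(passage)
```

A real embedding model would match on meaning rather than shared words, which is why a library such as sentence_transformers is a better fit than this word-count stand-in once the passages get longer and more varied.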

What worked:

One effective method involved scraping information from the UN Gifts website using the Python library Beautiful Soup. This approach successfully extracted relevant details about the artifacts.
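The extraction step can be sketched as follows. The project used Beautiful Soup; so that this example runs without any installs, it swaps in Python's built-in html.parser module instead. The HTML snippet, the class name ParagraphExtractor, and the helper extract_paragraphs are all hypothetical, and a real page from the UN Gifts website will have a different structure.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collects the visible text inside every <p> element."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            # Join the pieces and collapse runs of whitespace.
            text = " ".join("".join(self._buffer).split())
            if text:  # skip empty paragraphs
                self.paragraphs.append(text)
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._buffer.append(data)


def extract_paragraphs(html):
    """Return the cleaned text of each non-empty paragraph in the HTML."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


# Invented sample page (assumption: not real UN Gifts markup).
SAMPLE_HTML = """
<html><body>
  <h1>Example Gift</h1>
  <p>Donated by <b>Example Country</b> in 1970.</p>
  <p>   </p>
  <p>The piece is displayed in the
     visitors' lobby.</p>
</body></html>
"""

paragraphs = extract_paragraphs(SAMPLE_HTML)
for text in paragraphs:
    print(text)
```

With Beautiful Soup, the same idea is usually much shorter: parse the page with BeautifulSoup(html, "html.parser"), call find_all("p"), and read each element's get_text().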

Additionally, calling Google’s Gemini API to interpret the scraped information and provide a summarized response proved to be a valuable step in the process. The API’s capability to distill information helped in creating concise and meaningful interpretations of the artifacts’ significance.
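One way to picture the summarization step is as a prompt built from the scraped passages and passed to a text-generation function. The sketch below is an assumption about how this could be structured, not the notebook's actual code: build_prompt, summarize, and fake_generate are hypothetical names, and the model call is left as a pluggable function so the example runs without an API key.

```python
def build_prompt(artifact_name, passages, max_chars=4000):
    """Assemble a prompt that asks the model to rely on the given passages."""
    context_lines = []
    used = 0
    for i, passage in enumerate(passages, start=1):
        line = f"[{i}] {passage.strip()}"
        if used + len(line) > max_chars:
            break  # stop before the context grows past the character budget
        context_lines.append(line)
        used += len(line)
    context = "\n".join(context_lines)
    return (
        "Using only the numbered context below, explain in 3-4 sentences "
        f"why the artifact '{artifact_name}' is considered significant. "
        "If the context does not say, reply 'Not stated in the sources.'\n\n"
        f"Context:\n{context}"
    )


def summarize(artifact_name, passages, generate):
    """Build the prompt and pass it to `generate`, any prompt -> text callable."""
    return generate(build_prompt(artifact_name, passages))


def fake_generate(prompt):
    """Stand-in for a real model call so the sketch runs offline."""
    return f"(model reply to a {len(prompt)}-character prompt)"


# Invented example data (assumption: not from the UN Gifts Dataset).
reply = summarize(
    "Example Bell",
    ["Cast from donated coins.", "Rung once a year at a ceremony."],
    fake_generate,
)
print(reply)
```

In practice, generate would wrap a call to the Gemini API. With the google-generativeai package, for example, that could mean creating a GenerativeModel, calling generate_content(prompt), and returning the response's text. Asking the model to say when the sources are silent is one possible guard against answers that drift into guesswork.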

What didn’t work:

Attempting to scrape information about the artifacts from various internet sources did not yield satisfactory results. This approach often led to incomplete or unreliable data, making it difficult to ascertain the true significance of the artifacts.

The web scraping method, while useful, needed improvement. It often struggled to locate reliable information, and the resulting answers sometimes resorted to conjecture about the artifacts’ significance and the reasons they were presented as gifts.

If you had more time, what would you do?

Given more time, we would consider expanding the scope of the project. This could involve incorporating related historical events or information that connect to the artifacts, providing a more comprehensive understanding and serving as a starting point for further research. This supplementary information would enhance the overall context and significance of the artifacts.

Written by MehtA+

MehtA+ is founded and composed of a team of MIT, Stanford and Ivy League alumni. We provide technical bootcamps and college consulting services.
