During my internship at a D.C.-based institution earlier this year, I began working on a provenance project that involved organizing information, reviewing texts, and searching documents to determine whether a collection of objects was a loan or a donation. After eight weeks, I produced a final report and a file index to support my findings and suggestions. During the fall semester, the opportunity arose to research artificial intelligence (AI); I wondered how this technology could support provenance research workflows, speeding up the process of determining ownership and clearing up issues of intellectual property. My research included understanding how AI functions, recognizing its associated risks, and conceptualizing how AI could support provenance researchers. I initially found AI to be overwhelmingly vast and intimidating; I am now excited to see this tool become a reality for provenance researchers.
Some museums already employ AI for informative and analytical data processing. For example, as of 2020, the American Museum of Natural History (AMNH) in New York used IBM’s Watson Natural Language Processor (NLP) and Google Cloud NLP to assess post-visit surveys and reviews submitted through TripAdvisor to gauge visitor experiences. (Murphy, Oonagh, and Elena Villaespesa. The Museums + AI Network AI: A Museum Planning Toolkit. Goldsmiths, January 2020. https://themuseumsainetwork.files.wordpress.com/2020/02/20190317_museums-and-ai-toolkit_rl_web.pdf) The Metropolitan Museum of Art used AI image-recognition software to assist in tagging its artworks with metadata, making its collection more accessible. (Murphy and Villaespesa, AI: A Museum Planning Toolkit.) The Cooper Hewitt, Smithsonian Design Museum used AI to sort digital artworks by color, letting online users search artworks according to color palette. (Murphy and Villaespesa, AI: A Museum Planning Toolkit.) This paper proposes the use of AI in provenance research to aid in completing the records of under-documented objects in collections.
This paper is meant to motivate conversations in the museum community regarding AI-based provenance research. In this paper, I will:
- provide AI terminology to establish the industry vocabulary;
- define provenance as it relates to the use of AI;
- discuss the benefits of using AI for provenance research;
- discuss two of the main risks associated with AI as they pertain to provenance research;
- and provide other considerations for discussion.
I will illustrate the benefits, risks, and practical and ethical considerations attributed to AI-based provenance research. It is not the intention of this paper to promote or rate specific software brands or companies; however, some may be mentioned in examples from other professional users and in my research findings suggesting a process for use. The work in this paper is based on research generated in the field, outside researchers using AI, AI-focused news stories, and online contributors whose specialty is AI ethics.
Intelligence, defined.
Artificial Intelligence, or AI, is the simulation of human responses based on a series of step-by-step mathematical instructions, or algorithms, supplied to produce an answer. (Murphy and Villaespesa, AI: A Museum Planning Toolkit.) Generative AI,1 an AI technology capable of creating text, images, and code based on data patterns, is trained on huge quantities of data, predicting patterns to create new data. (Pasick, Adam. “Artificial Intelligence Glossary: Neural Networks and Other Terms Explained.” The New York Times, March 27, 2023, sec. Technology. https://www.nytimes.com/article/ai-artificial-intelligence-glossary.html.)(Coursera. “Artificial Intelligence (AI) Terms: A to Z Glossary,” June 15, 2023. https://www.coursera.org/articles/ai-terms.) Large Language Models (LLMs) process large amounts of data and utilize natural language processing, software designed to understand and generate responses to human language input. (Pasick, “Artificial Intelligence Glossary.”) Due to the complexity of human language, LLMs use fragments of words, or tokens, to plot language as numerical vector coordinates, or word vectors. (Lee, Timothy B., and Sean Trott. “A Jargon-Free Explanation of How AI Large Language Models Work.” Ars Technica, July 31, 2023. https://arstechnica.com/science/2023/07/a-jargon-free-explanation-of-how-ai-large-language-models-work/.) Word vectors (a concept similar to global positioning systems, or GPS) provide AI the extensive connections and pathways toward understanding human language, giving AI the ability to process requests in a virtual wordscape. (Lee and Trott, “A Jargon-Free Explanation.”)2 Machine learning (ML) is software that uses algorithms that learn from data, observations, and actions (Murphy and Villaespesa, AI: A Museum Planning Toolkit) and can be descriptive, predictive, or prescriptive. (Brown, Sara. “Machine Learning, Explained | MIT Sloan.” MIT Sloan School of Management, April 21, 2021. https://mitsloan.mit.edu/ideas-made-to-matter/machine-learning-explained.)3 As the complexity of the AI algorithm increases, the ML functions and output become more complex. (Murphy and Villaespesa, AI: A Museum Planning Toolkit.)4
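To make the word-vector idea concrete, here is a minimal Python sketch. The three-number vectors and the word list are invented for illustration only; real LLMs learn vectors with hundreds or thousands of dimensions. The sketch shows how similarity between vector "coordinates" lets software treat related words as neighbors in the wordscape:

```python
import math

# Toy three-dimensional "word vectors" (invented for illustration;
# real models learn vectors with hundreds or thousands of dimensions).
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    """Measure how closely two word vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Words used in similar contexts sit near each other in the wordscape:
royal = cosine_similarity(vectors["king"], vectors["queen"])
fruit = cosine_similarity(vectors["king"], vectors["apple"])
```

Here `royal` comes out far higher than `fruit`, mirroring how a model treats "king" and "queen" as related while "apple" sits elsewhere in the vector space.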
Provenance, defined.
Provenance, as defined by the Getty Museum, is an object’s history of ownership, connecting its sources and origins and establishing authenticity. (“Collecting & Provenance Research (Getty Research Institute).” Accessed November 30, 2023. https://www.getty.edu/research/tools/provenance/.) Due to early museum collection practices, there are accessioned objects in museums whose history of ownership is under-documented or without a clear title.5 For lack of authentication, these objects cannot be used to their full potential and take up valuable space and resources while waiting for verification. Provenance research must be performed to establish ownership history, which could also have implications for proprietorship. (Malaro, Marie C., and Ildiko Pogany DeAngelis. A Legal Primer on Managing Museum Collections. Third edition. Washington: Smithsonian Books, 2012. Kindle.) Considering the number of objects contained in any one museum and the resources required to research them, AI could prove an effective and efficient tool for filling provenance gaps.
AI benefits the provenance research workflow.
Efficiency
The answer to a question of ownership could be as simple as locating a single misfiled loan document or as complex as sifting through reams of archival materials to find 18th-century museum procedures for loans of unclaimed objects. Provenance research, based on my internship experience, requires hours of reviewing and analyzing many types of archival reference materials, creating checklists, establishing search criteria for online repositories and archives, transcribing, and composing documents that complete an object’s historical narrative. Aside from text summarization, AI can perform as the provenance researcher’s assistant by:
- providing writing prompts and feedback to aid composition; (Stapleton, Andy. “How To Write An A+ Essay Using AI in 3 Simple Steps.” YouTube video. 8:08. October 30, 2023. https://www.youtube.com/watch?v=EeMm-kaYgI0.)
- suggesting improvements to evidentiary findings and aiding in citation; (Stapleton, “How To Write An A+ Essay Using AI.”)
- creating rubrics for final reports; (Stapleton, “How To Write An A+ Essay Using AI.”)
- and proofing interoffice memorandums regarding provenance. (Christou, Prokopis. “How to Use Artificial Intelligence (AI) as a Resource, Methodological and Analysis Tool in Qualitative Research?” The Qualitative Report, July 31, 2023. https://doi.org/10.46743/2160-3715/2023.6406.)
Digitizing Early Texts
Considering the improvements made in neural networks and natural language processing, document texts can be scanned, converted, and input into AI applications to produce machine-readable documents prior to or at the start of an AI-based provenance search. (“The Evolution of Document Scanning - How Can AI Help You? | ITS Group.” Accessed November 25, 2023. https://www.its-group.com/news/story/the-evolution-of-document-scanning-how-can-i-help-you.) Prior to producing machine-readable documents through AI, documents are converted to readable text through the Adobe Portable Document Format (PDF) Professional application or a scanner via Optical Character Recognition (OCR) software, although the outcome is not always correct due to document condition or choice of typeface.6 Training AI to transcribe printed documents is a time-saving option that may provide researchers with digital sources to easily search and review with AI software.
Analysis
Part of the provenance process requires reading biographical and subject-matter texts. These materials may be processed through generative AI programs (either copy-pasted or via uploaded PDF file) which, when prompted, can quickly summarize and organize information into digestible bullet points for quick review. Christou reports that AI detects “key concepts” in documents, saving researchers time they would otherwise spend sifting through dozens of pages of text. (Christou, Prokopis. “How to Use Artificial Intelligence (AI) as a Resource, Methodological and Analysis Tool in Qualitative Research?” The Qualitative Report, July 31, 2023. https://doi.org/10.46743/2160-3715/2023.6406.) This aspect of AI could give provenance researchers a means to prioritize evidence and acquire key information that might otherwise be overlooked after hours of reading through multiple texts. When prompted, AI can recommend courses of action and other concepts or facts to research as part of its predictive nature. (Christou, “How to Use Artificial Intelligence (AI).”)
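The key-concept detection itself happens inside a generative model, but the basic idea of surfacing a document's most prominent terms can be illustrated with a simple word-frequency sketch in Python. The sample text and the tiny stop-word list are invented for illustration; real AI summarization is far more sophisticated than counting words:

```python
from collections import Counter
import re

# Invented sample text standing in for an archival document.
text = """The bronze statue was loaned to the museum in 1902.
The loan record lists the statue, and the loan was renewed in 1910."""

# A tiny stop-word list, for illustration only.
stop_words = {"the", "was", "to", "in", "and", "a", "of"}

# Lowercase the text, pull out the words, and count the substantive ones.
words = re.findall(r"[a-z]+", text.lower())
key_terms = Counter(w for w in words if w not in stop_words)

# The most frequent substantive terms hint at the document's key concepts.
top_terms = [term for term, _ in key_terms.most_common(3)]
```

Running this surfaces terms like "statue" and "loan", the kind of concepts a researcher would want flagged before committing hours to a full read.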
Within the provenance workflow, AI can aid in assessing evidence, based on the established AI models supplied, toward the final determination of an object’s history of ownership. Agrawal, Gans, and Goldfarb state that using AI as part of the workflow prevents under-testing and increases the need for additional labor for downstream tasks. (Agrawal, Ajay, Joshua S. Gans, and Avi Goldfarb. “Artificial Intelligence: The Ambiguous Labor Market Impact of Automating Prediction.” Journal of Economic Perspectives 33, no. 2 (May 1, 2019): 31-50. https://doi.org/10.1257/jep.33.2.31.) Once a provenance researcher becomes attuned to the AI workflow, transparency and accountability may be expressed through published findings and accession file updates.
Mitigating Bias and Inaccuracy
All technologies have risks. (“The History of Artificial Intelligence (4K) | CyberWork And The American Dreams | Spark.” YouTube video. 55:37. July 13, 2022. https://www.youtube.com/watch?v=q6U9mhKAFFA.) By investigating and mitigating risks now, museums can prepare information and AI models that yield accurate and ethical results. Two of the main risks that affect the trustworthiness of AI-based provenance research are bias and inaccuracy.
Bias
AI bias occurs when the training data retains human biases and the system responds with biased results. (Villaespesa, Elena, and Ariana French. “AI, Visitor Experience, and Museum Operations: A Closer Look at the Possible.” 101-13, 2019.) Because biases exist in society, human-generated data is also biased. AI outputs are biased when the programmer creating the algorithms used to generate data inserts their own biases into the mix. (Biddle, Sam. “The Internet’s New Favorite AI Proposes Torturing Iranians and Surveilling Mosques.” The Intercept, December 8, 2022. https://theintercept.com/2022/12/08/openai-chatgpt-ai-bias-ethics/.) During the human review period, or AI evaluation, the reviewer can also insert their own biases when rating the AI. (Reagan, Mary. “Understanding Bias and Fairness in AI Systems.” Medium, April 2, 2021. https://towardsdatascience.com/understanding-bias-and-fairness-in-ai-systems-6f7fbfe267f3.)
AI bias can be introduced at any point during an AI workflow, so museums should be prepared to deal with it during the course of AI-based research. Historical biases are prejudicial concepts directed at marginalized populations. (Reagan, “Understanding Bias and Fairness in AI Systems.”) For example, the historical bias that doctors are men and nurses are women appeared in earlier versions of Google Translate, an application used for language translation. (Brannon, Isabella, Allen Su, Caitlyn Vergara, and John Villasenor. “AI & Bias - When Algorithms Don’t Work.” YouTube video. 7:13. September 14, 2022. https://www.youtube.com/watch?v=FD-4yC95iZY.) Representation bias occurs when populations are inaccurately represented or the dataset representing a population is insufficient. (Reagan, “Understanding Bias and Fairness in AI Systems.”) In matters of racial representation, most of the data collected skews significantly toward one race, gender, or socioeconomic level; the variation across levels is negligible or non-existent, leaving the AI little information to compare and contrast. (Reagan, “Understanding Bias and Fairness in AI Systems.”) Measurement bias occurs when selecting data with specific labels and features. (Reagan, “Understanding Bias and Fairness in AI Systems.”)
A case of measurement bias, reported in ProPublica’s article “Machine Bias,” involved flawed recidivism-rate data in “predictive policing,” where, despite their records and offenses, Black people were subject to harsher sentences than white people. (Reagan, “Understanding Bias and Fairness in AI Systems.”)(Angwin, Julia, Jeff Larson, Lauren Kirchner, and Surya Mattu. “Machine Bias.” ProPublica. Accessed November 14, 2023. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing.) AI modeling, or the training of data for AI, can be subject to evaluation bias when the model is improperly used for other points of reference. (Reagan, “Understanding Bias and Fairness in AI Systems.”)
Because these decisions concern object histories, the data supplied to AI must be reliable, fair, and defined before deployment within the AI project cycle. When AI results seem unfair, the data and results should be reviewed and explained; if either cannot be explained, a bias may be involved. (TDS Editors. “Why Eliminating Bias in AI Systems Is So Hard.” Medium, October 28, 2021. https://towardsdatascience.com/why-eliminating-bias-in-ai-systems-is-so-hard-97e4f60ffe93.) During the evaluation point of the AI project cycle, biases will have to be mapped to determine where in the process they are being introduced.
Museum staff can collectively and openly discuss definitions of bias, relying on the research of AI ethics professionals. (Dikow, Rebecca B, Corey DiPietro, Michael G Trizna, Hanna BredenbeckCorp, Madeline G Bursell, Jenna T B Ekwealor, Richard G J Hodel, et al. “Developing Responsible AI Practices at the Smithsonian Institution.” Research Ideas and Outcomes 9 (October 25, 2023): e113334. https://doi.org/10.3897/rio.9.e113334.) For instance, Smithsonian Institution staff formed an AI reading group that reviewed and discussed research-focused articles prior to developing the Institution’s own AI policies. (Dikow et al., “Developing Responsible AI Practices.”) This collaborative activity provided an open forum for Smithsonian staff to explore ethical issues regarding AI bias in a museum environment.
Another way to mitigate the effects of biased data is to become a trained, ethical creator of data. (TDS Editors, “Why Eliminating Bias in AI Systems Is So Hard.”) With the proper training, provenance researchers can reduce the biases in their data. For better control of data creation and management, Will Keefe suggests training in Python, an open-source programming language, while incorporating and maintaining standards of ethical data creation. (TDS Editors, “Why Eliminating Bias in AI Systems Is So Hard.”)6
Researchers should take an “active role” in verifying AI by cross-referencing AI outputs against other sources, analyzing connections with other areas of research, and checking for discrepancies. (Christou, “How to Use Artificial Intelligence (AI).”) This AI proofing checks the accuracy and trustworthiness of the researcher’s findings. (Christou, “How to Use Artificial Intelligence (AI).”) Christou suggests that data collection should be a varied process, making it easier to verify AI-based findings when using the cross-referencing method. (Christou, “How to Use Artificial Intelligence (AI).”) This means comparing AI outputs against physical texts, online research, and internal documents.
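The cross-referencing step can be sketched in Python. Everything in this example is invented for illustration (the field names, the claims, and the verified facts); the point is only the shape of the check: tabulate AI-generated claims against facts confirmed in archival or internal sources, then flag mismatches and unconfirmed fields for manual review:

```python
# Invented example data: claims produced by an AI tool versus
# facts a researcher has verified in archival sources.
ai_claims = {
    "acquisition_year": "1902",
    "donor": "J. Smith",
    "object_type": "bronze statue",
}

verified_facts = {
    "acquisition_year": "1902",
    "donor": "A. Jones",  # the archival loan file names a different donor
}

def cross_reference(claims, facts):
    """Return fields where the AI output disagrees with a verified source,
    plus fields no source has yet confirmed."""
    discrepancies = {k: (claims[k], facts[k])
                     for k in claims if k in facts and claims[k] != facts[k]}
    unverified = [k for k in claims if k not in facts]
    return discrepancies, unverified

discrepancies, unverified = cross_reference(ai_claims, verified_facts)
```

In this invented case the donor field is flagged as a discrepancy and the object type as unverified; in practice both lists would send the researcher back to the physical texts rather than into the AI output.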
AI-based provenance research can be documented in a couple of ways: by noting specific AI findings in interoffice memos and project notes, and by citing the AI software using the Chicago Manual of Style. (The Chicago Manual of Style Online. “The Chicago Manual of Style, 17th Edition.” Accessed November 14, 2023. https://www.chicagomanualofstyle.org/.) As part of the AI workflow, this transparency details where human and AI data are used, benefiting the museum by documenting the quality of the museum’s AI datasets and modeling. (Dikow et al., “Developing Responsible AI Practices.”)
Provenance researchers should also refer to nationally recognized groups who have considered these same issues. There are organizations that provide frameworks of support for navigating ethical AI tools. The National Institute of Standards and Technology (NIST) suggests that subject-matter experts work within AI “teams” to resolve issues regarding “alignment and deployment conditions.” (NIST AIRC. “NIST AIRC - AI RMF Core.” Accessed October 19, 2023. https://airc.nist.gov/AI_RMF_Knowledge_Base/AI_RMF/Core_And_Profiles/5-sec-core.)
Inaccuracy
Given the complexity of human language, user error, the software’s data sourcing, and the developer’s algorithms, AI has been known to respond incorrectly to a researcher’s prompts. Based on my personal experiments with ChatGPT, Claude 2, and Google Bard, the responses have been thorough and clear, but each application mixed some errors into its responses (such as recommending imaginary sources). Using the text of the Department of the Interior’s Native American Graves Protection and Repatriation Act (NAGPRA) and the United Nations’ Declaration on the Rights of Indigenous Peoples, I prompted the various AI applications to answer questions regarding museum repatriation based on the text. My experiments provided insight into the need for proper staff training when using these applications; I had never used AI prior to writing this paper, so the initial interaction was instructive. Because I am familiar with both documents, I was able to determine the accuracy of the AI responses to my experimental prompts.
Hallucinations, or confabulations of information, occur due either to insufficient training data or to a lack of appropriate source data for a correct response. (Matsakis, Louise. “Artificial Intelligence May Not ‘Hallucinate’ After All.” Wired. Accessed November 6, 2023. https://www.wired.com/story/adversarial-examples-ai-may-not-hallucinate/.) There are a number of circumstances where AI software could answer the same prompt differently and incorrectly in the same session. (Weise, Karen, and Cade Metz. “When A.I. Chatbots Hallucinate.” The New York Times, May 1, 2023, sec. Business. https://www.nytimes.com/2023/05/01/business/ai-chatbots-hallucination.html.) The reason for the inaccuracy is the predictive nature of the software, as the algorithm detects and produces patterns in various ways. (Weise and Metz, “When A.I. Chatbots Hallucinate.”) Because provenance research uses many sources of information to prove or disprove gaps in ownership, hallucinations will, for the time being, require additional fact-checking steps using designated datasets and models created for specific research purposes; this closed-system approach avoids outside biased data. (OpenAI. “GPT-4 System Card,” March 23, 2023.)
Some Considerations
Education
With education and training in AI abundant on the web, museums interested in using AI for research purposes should consider standardized training so staff are prepared to use the software properly and get the best results. GitHub and OpenAI have resources to prepare museum staff, but it would be best to agree on what resources and authorities should be involved in the training process. (“Build Software Better, Together.” n.d. GitHub. Accessed December 2, 2023. https://github.com.)(“OpenAI Platform.” n.d. Accessed December 2, 2023. https://platform.openai.com.) Provenance researchers who use improperly trained AI applications, or who choose incorrect datasets, risk inaccurate and biased results.
AI Workflows
Museums suffer from a lack of “TMPR”: time, money, people, and resources. (Weisberg, Robert J. “Is AI the Right or Wrong Solution to the Right or Wrong Problem?” Museum Human, October 24, 2023. https://www.museumhuman.com/is-ai-the-right-or-wrong-solution-to-the-right-or-wrong-problem/.) AI project cycles map out benchmarks that maximize the benefits of AI for a greater return on investment. (Maddula, Surya. “The AI Project Cycle.” Medium (blog), November 4, 2021. https://suryamaddula.medium.com/the-ai-project-cycle-e363ce3f4f6f.)
For the purpose of project efficiency, an AI project cycle should include:
- problem identification;
- project scope;
- data acquisition;7
- data exploration, where patterns in the information are explored using the 4 W’s;8
- AI modeling, where prepared data is fed into the AI;9
- evaluation of the AI;
- and deployment, or integration, of the AI.
(Maddula, “The AI Project Cycle.”)
Once deployment is underway, an established provenance research workflow is important to prevent project derailment, establish focus, and manage museum resources. Museums should consider how their provenance research workflow may involve AI and what data sources will be permitted for use.
The project scope determines what information is lacking within the object file; the AI project cycle then determines a means to expedite the process and deploys the model for use within the provenance research workflow. Part of the Smithsonian Institution’s AI Values Statement requires staff to consider whether AI is appropriate for solving the problem at hand (the project scope). (“AI Values Statement.” Smithsonian Data Science Lab, 2022. https://datascience.si.edu/ai-values-statement.) Provenance research workflows should specify what the final deliverables will include and how due diligence is proven. Once provenance decisions are finalized and additions are made to an object’s file, AI usage should be publicly indicated within the provenance record. The Smithsonian Institution supports documenting “how the AI content was produced” and delineating between “human” and “AI” generated content. (“AI Values Statement.”) According to the Carnegie Museum of Art provenance standard, a footnote or a note at the end of the provenance record could be used to note AI use. (“Art Tracks | Art Tracks.” Accessed September 6, 2023. http://www.museumprovenance.org/.) Until a standard for AI is created, these additions can provide transparency of use for museums.
Matters of Privacy
Museum staff must consider how to handle personal and museum data when conducting AI-based provenance research. AI applications such as ChatGPT collect data from their users, including interaction data, browser settings, internet protocol (IP) addresses, “conversation titles,” and “chat histories,” which are made available to third parties during use. (Khowaja, Sunder Ali, Parus Khuwaja, and Kapal Dev. “ChatGPT Needs SPADE (Sustainability, Privacy, Digital Divide, and Ethics) Evaluation: A Review.” arXiv, April 13, 2023. http://arxiv.org/abs/2305.03123.) This means internal documents that hold donor data, museum operational information, and other sensitive material could be made public on the web. Museum management should carefully review an application’s privacy policy during the vetting process to ensure it aligns with the museum’s ethical standards and practices.
There are ways to protect data during AI usage. These include tokenization, or replacing sensitive data with non-sensitive tokens so the data is unavailable to unauthorized users, and (similar to tokenization) data masking, or the scrambling of data while maintaining its basic structure. (Takyar, Akash. “Data Security in AI Systems.” LeewayHertz - AI Development Company, June 19, 2023. https://www.leewayhertz.com/data-security-in-ai-systems/.) These options may require additional research to integrate into current AI project cycles.
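Both techniques can be sketched in a few lines of Python. The memo text, donor name, and email address below are invented for illustration; production tokenization systems would use secure token vaults rather than an in-memory dictionary:

```python
import re

# Invented example: an internal memo containing donor details.
memo = "Donor Jane Doe (jane.doe@example.org) gifted the statue in 1902."

token_store = {}  # maps tokens back to the original sensitive values

def tokenize(text, pattern, label):
    """Tokenization: replace each sensitive match with a non-sensitive
    token, keeping the originals in a store only authorized staff can read."""
    def repl(match):
        token = f"[{label}-{len(token_store) + 1}]"
        token_store[token] = match.group(0)
        return token
    return re.sub(pattern, repl, text)

def mask_years(text):
    """Data masking: scramble four-digit years while keeping the
    document's basic structure intact."""
    return re.sub(r"\b\d{4}\b", "####", text)

# Tokenize the email address, then mask the year.
safe_memo = tokenize(memo, r"[\w.]+@[\w.]+", "EMAIL")
masked_memo = mask_years(safe_memo)
```

After both steps the memo reads "Donor Jane Doe ([EMAIL-1]) gifted the statue in ####." and can be shared with an external AI tool, while the token store stays inside the institution's protected servers.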
Within the museum environment, provenance researchers can mitigate privacy concerns by:
- using models trained locally, on local devices within the institution’s protected servers;
- instituting security measures that protect personal and museum data during use;
- creating and enforcing ethical data-usage policies that address the use of biased data, user data, and data sharing;
- and training researchers on the implications of non-adherence to museum data privacy policies.
(Khowaja, Khuwaja, and Dev, “ChatGPT Needs SPADE.”)
Indigenous Cultures
Another use of AI in provenance research is in determining object repatriation for Indigenous cultures for the purposes of NAGPRA. Using AI for provenance determinations of Indigenous cultural property is not appropriate without a conversation with the affected cultural group. (Lewis, Jason Edward, Angie Abdilla, Noelani Arista, Kaipulaumakaniolono Baker, Scott Benesiinaabandan, Michelle Brown, Melanie Cheung, et al. “Indigenous Protocol and Artificial Intelligence Position Paper.” Monograph. Honolulu, HI: Indigenous Protocol and Artificial Intelligence Working Group and the Canadian Institute for Advanced Research, 2020. https://doi.org/10.11573/spectrum.library.concordia.ca.00986506.) The Indigenous Protocol and Artificial Intelligence Working Group has written a position paper for those interested in learning more about different cultural positions regarding AI use. Using AI to research and make determinations about Indigenous cultural objects should also be addressed in the museum’s ethical stewardship policies within its collection management plan.
Conclusion
Museums stand to benefit significantly from AI-based provenance research: as backlogs of ownership investigations are resolved, collections become more available for exhibition, study, and programming. AI-based provenance research could positively impact access to collection objects by providing provenance researchers a means of processing large amounts of information in shorter timeframes.
As mentioned, AI technology carries risks that can negatively affect a museum’s authority over its collection. Understanding the AI project cycle is essential to identifying how the different types of bias are introduced and can help provenance researchers avoid them. By proactively mitigating bias, museums can confidently and openly use AI for provenance research. Education and focused training in data creation and LLM usage can help prevent most inaccurate AI outputs. Transparently documenting AI use, auditing for biases, controlling data inputs, and adhering to emerging best practices for ethical AI development all improve AI models for continued use by both the provenance researcher and the museum.
AI technology is rapidly evolving; museums can establish responsible use by clearly defining project scopes, workflows, and intended applications. AI is a vast and intimidating technology that could hinder productivity and produce faulty responses if not used properly. Learning about the elements of AI as a group ensures staff are on the same page when developing policies, training, and ethical standards. Key terminology has been defined here to provide a means of navigating and encouraging discussions about integrating AI into a provenance workflow. The ultimate goal of AI-based provenance research is a technological tool that supports provenance researchers in acquiring the information needed to make under-documented objects valuable as educational tools for the public.
Notes
-
ChatGPT, Claude 2, and Google Bard are examples of large language models (LLMs). (Coursera. “Artificial Intelligence (AI) Terms: A to Z Glossary,” June 15, 2023. https://www.coursera.org/articles/ai-terms.) ↩︎
-
This definition makes sense when considering early computing, where binary language, a language of ones and zeros, served as the “early” linear vector for simple computer operations. Word vectors represent general human word relationships as contextual numerical coordinates: words with related meanings are assigned nearby coordinates. (Lee, Timothy B., and Sean Trott. “A Jargon-Free Explanation of How AI Large Language Models Work.” Ars Technica, July 31, 2023. https://arstechnica.com/science/2023/07/a-jargon-free-explanation-of-how-ai-large-language-models-work/.) ↩︎
-
The predictive function, which anticipates what may occur based on patterns in data, is already popular in everyday applications, for example, suggested text in emails and Netflix user recommendations. The prescriptive function goes further, using patterns in past data to recommend what should happen next. (Wikipedia. “Artificial Intelligence.” Reference, October 18, 2023. https://en.wikipedia.org/w/index.php?title=Artificial_intelligence&oldid=1180679786#Applications.) ↩︎
-
Within machine learning are four approaches to training a learning algorithm: supervised, where the model is directly trained on labeled data by a user; semi-supervised, where the model receives a mix of labeled and unlabeled data and must find patterns to structure it; unsupervised, where the algorithm detects patterns on its own; and reinforcement, where the algorithm learns from feedback signaling when success is achieved. (Gavrilova, Yulia. “AI vs. ML vs. DL: What’s the Difference.” Blog. Serokell Software Development Company, April 8, 2020. https://serokell.io/blog/ai-ml-dl-difference.) ↩︎
-
Based on my experiences during my internship and my studies in the George Washington University Museum Studies graduate program. ↩︎
-
The Python.org website features instructional and informational pages on AI. ↩︎
-
As mentioned earlier in the paper, AI can transcribe written texts, and those transcriptions can then be used for AI modeling. ↩︎
-
The 4 W’s: The Who, What, Where, and Why. ↩︎
-
There is also the learning-based approach, where the AI is trained using one of the four learning methods: supervised, semi-supervised, unsupervised, or reinforcement. To learn more, read “The AI Project Cycle.” ↩︎
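The word-vector idea in note 2 can be sketched in a few lines of Python. The three-dimensional vectors below are invented toy values; real language models use hundreds or thousands of dimensions, but the same cosine-similarity measure applies: words with related meanings have vectors pointing in similar directions.

```python
import math

# Toy 3-dimensional "word vectors" (invented illustrative values, not taken
# from a trained model).
vectors = {
    "painting": [0.9, 0.8, 0.1],
    "sculpture": [0.8, 0.9, 0.2],
    "invoice": [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Related words sit closer together than unrelated ones.
print(cosine_similarity(vectors["painting"], vectors["sculpture"]))  # high (~0.99)
print(cosine_similarity(vectors["painting"], vectors["invoice"]))    # low (~0.30)
```

This geometric closeness is what lets an AI model treat “painting” and “sculpture” as related concepts even when the exact words differ, which is the property a provenance search tool would rely on.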
Bibliography
- Agrawal, Ajay, Joshua S. Gans, and Avi Goldfarb. “Artificial Intelligence: The Ambiguous Labor Market Impact of Automating Prediction.” Journal of Economic Perspectives 33, no. 2 (May 1, 2019): 31-50. https://doi.org/10.1257/jep.33.2.31.
- “AI Values Statement.” AI Values Statement | Smithsonian Data Science Lab, 2022. https://datascience.si.edu/ai-values-statement.
- Angwin, Julia, Jeff Larson, Lauren Kirchner, and Surya Mattu. “Machine Bias.” ProPublica. Accessed November 14, 2023. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing.
- “Art Tracks.” Accessed September 6, 2023. http://www.museumprovenance.org/.
- Biddle, Sam. “The Internet’s New Favorite AI Proposes Torturing Iranians and Surveilling Mosques.” The Intercept, December 8, 2022. https://theintercept.com/2022/12/08/openai-chatgpt-ai-bias-ethics/.
- Brannon, Isabella, Allen Su, Caitlyn Vergara, and John Villasenor. “AI & Bias - When Algorithms Don’t Work,” YouTube video. 7:13. September 14, 2022. https://www.youtube.com/watch?v=FD-4yC95iZY.
- Brown, Sara. “Machine Learning, Explained | MIT Sloan.” Education. MIT Management Sloan School, April 21, 2021. https://mitsloan.mit.edu/ideas-made-to-matter/machine-learning-explained.
- GitHub. “Build Software Better, Together.” Accessed December 2, 2023. https://github.com.
- Christou, Prokopis. “How to Use Artificial Intelligence (AI) as a Resource, Methodological and Analysis Tool in Qualitative Research?” The Qualitative Report, July 31, 2023. https://doi.org/10.46743/2160-3715/2023.6406.
- “Collecting & Provenance Research (Getty Research Institute).” Accessed November 30, 2023. https://www.getty.edu/research/tools/provenance/.
- Coursera. “Artificial Intelligence (AI) Terms: A to Z Glossary,” June 15, 2023. https://www.coursera.org/articles/ai-terms.
- Dikow, Rebecca B, Corey DiPietro, Michael G Trizna, Hanna BredenbeckCorp, Madeline G Bursell, Jenna T B Ekwealor, Richard G J Hodel, et al. “Developing Responsible AI Practices at the Smithsonian Institution.” Research Ideas and Outcomes 9 (October 25, 2023): e113334. https://doi.org/10.3897/rio.9.e113334.
- Gavrilova, Yulia. “AI vs. ML vs. DL: What’s the Difference.” Blog. Serokell Software Development Company, April 8, 2020. https://serokell.io/blog/ai-ml-dl-difference.
- Khowaja, Sunder Ali, Parus Khuwaja, and Kapal Dev. “ChatGPT Needs SPADE (Sustainability, Privacy, Digital Divide, and Ethics) Evaluation: A Review.” arXiv, April 13, 2023. http://arxiv.org/abs/2305.03123.
- Lee, Timothy B., and Sean Trott. “A Jargon-Free Explanation of How AI Large Language Models Work.” Ars Technica, July 31, 2023. https://arstechnica.com/science/2023/07/a-jargon-free-explanation-of-how-ai-large-language-models-work/.
- Lewis, Jason Edward, Angie Abdilla, Noelani Arista, Kaipulaumakaniolono Baker, Scott Benesiinaabandan, Michelle Brown, Melanie Cheung, et al. “Indigenous Protocol and Artificial Intelligence Position Paper.” Monograph. Honolulu, HI: Indigenous Protocol and Artificial Intelligence Working Group and the Canadian Institute for Advanced Research, 2020. https://doi.org/10.11573/spectrum.library.concordia.ca.00986506.
- Maddula, Surya. “The AI Project Cycle.” Medium (blog), November 4, 2021. https://suryamaddula.medium.com/the-ai-project-cycle-e363ce3f4f6f.
- Malaro, Marie C., and Ildiko Pogany DeAngelis. A Legal Primer on Managing Museum Collections. 3rd ed. Washington: Smithsonian Books, 2012. Kindle.
- Matsakis, Louise. “Artificial Intelligence May Not ‘Hallucinate’ After All.” Wired. Accessed November 6, 2023. https://www.wired.com/story/adversarial-examples-ai-may-not-hallucinate/.
- Murphy, Oonagh, and Elena Villaespesa. The Museums + AI Network AI: A Museum Planning Toolkit. Goldsmiths, January 2020. https://themuseumsainetwork.files.wordpress.com/2020/02/20190317_museums-and-ai-toolkit_rl_web.pdf.
- NIST AIRC. “NIST AIRC - AI RMF Core.” Accessed October 19, 2023. https://airc.nist.gov/AI_RMF_Knowledge_Base/AI_RMF/Core_And_Profiles/5-sec-core.
- OpenAI. “GPT-4 System Card,” March 23, 2023.
- “OpenAI Platform.” Accessed December 2, 2023. https://platform.openai.com.
- Pasick, Adam. “Artificial Intelligence Glossary: Neural Networks and Other Terms Explained.” The New York Times, March 27, 2023, sec. Technology. https://www.nytimes.com/article/ai-artificial-intelligence-glossary.html.
- Reagan, Mary. “Understanding Bias and Fairness in AI Systems.” Medium, April 2, 2021. https://towardsdatascience.com/understanding-bias-and-fairness-in-ai-systems-6f7fbfe267f3.
- Stapleton, Andy. “How To Write An A+ Essay Using AI in 3 Simple Steps.” YouTube video. 8:08. October 30, 2023. https://www.youtube.com/watch?v=EeMm-kaYgI0.
- Takyar, Akash. “Data Security in AI Systems.” LeewayHertz - AI Development Company, June 19, 2023. https://www.leewayhertz.com/data-security-in-ai-systems/.
- TDS Editors. “Why Eliminating Bias in AI Systems Is So Hard.” Medium, October 28, 2021. https://towardsdatascience.com/why-eliminating-bias-in-ai-systems-is-so-hard-97e4f60ffe93.
- The Chicago Manual of Style Online. “The Chicago Manual of Style, 17th Edition.” Accessed November 14, 2023. https://www.chicagomanualofstyle.org/.
- “The Evolution of Document Scanning - How Can AI Help You? | ITS Group.” Accessed November 25, 2023. https://www.its-group.com/news/story/the-evolution-of-document-scanning-how-can-i-help-you.
- “The History of Artificial Intelligence (4K) | CyberWork And The American Dreams | Spark.” YouTube video. 55:37. July 13, 2022. https://www.youtube.com/watch?v=q6U9mhKAFFA.
- Villaespesa, Elena, and Ariana French. “AI, Visitor Experience, and Museum Operations: A Closer Look at the Possible.” 101-13, 2019.
- Weisberg, Robert J. “Is AI the Right or Wrong Solution to the Right or Wrong Problem?” Museum Human, October 24, 2023. https://www.museumhuman.com/is-ai-the-right-or-wrong-solution-to-the-right-or-wrong-problem/.
- Weise, Karen, and Cade Metz. “When A.I. Chatbots Hallucinate.” The New York Times, May 1, 2023, sec. Business. https://www.nytimes.com/2023/05/01/business/ai-chatbots-hallucination.html.
- Wikipedia. “Artificial Intelligence.” Reference, October 18, 2023. https://en.wikipedia.org/w/index.php?title=Artificial_intelligence&oldid=1180679786#Applications.