LLM watermarks are still fragile

Use case: the customer experience journey

Posted on April 5th, 2024

Summary

This week saw more articles appear on the theme of disinformation. On the technical side, a Technology Review article reiterates the results of an ETH Zurich paper which shows that watermarks for LLM-generated contents are not robust. This means that watermarks on textual content can be removed and forged. Another article reviews the Threads social media platform. An interesting point made in the article is the suggestion that people have moved to this platform because of its low level of content moderation.

Two papers from MIT Technology Review present interviews with experts in the customer experience domain about how AI is having an impact there. The business objective behind AI integration is to improve customer management processes since experts believe that 80% of customers remain with a brand based on their customer experience. AI is chiefly being used for sentiment analysis and to help in specific tasks, like note-taking.

Two recent research papers on arxiv.org present surveys. A paper from Drexel University surveys the impact of LLMs on security and privacy. The paper presents vulnerabilities, attack scenarios but also cases where LLMs improve defense. A paper from Indian researchers review the use of LLMs to improve the Infrastructure-as-code frameworks, where the goal to reduce manual intervention in the whole software DevOps pipeline.

1. Hugging Face partners with Wiz Research to Improve AI Security

Hugging Face has teamed up with Wiz to enhance security within its platform and the broader AI/ML community. Wiz specializes in cloud security, aiding clients in developing and maintaining secure software practices. The collaboration focuses on Cloud Security Posture Management (CSPM). Hugging Face uses multiple Kubernetes clusters across several regions and cloud providers, so a centralized security management is helpful.

The article underscores the risks associated with Pickle files, a widely used serialization format in ML and the default for PyTorch model weights. Loading a pickle file can expose systems to code injection attacks. In response, Hugging Face is exploring Safetensors, a new serialization format for weights designed to be more secure.

Security measures implemented by Hugging Face include Picklescan, developed in partnership with Microsoft, Safetensors, a secure alternative to pickle files, a bug bounty program, malware scanning, secret scanning to detect hardcoded passwords or API keys in code, and accepting models only from trusted sources.

2. Why Threads is suddenly popular in Taiwan

Threads, Meta's text-based social network, has found significant popularity in Taiwan, especially among young people. Threads has consistently dominated app-store download charts in Taiwan and has attracted prominent officials to set up accounts. Threads was introduced as Meta's response to Twitter, particularly after Elon Musk's acquisition of the latter, prompting users to seek alternative platforms. Moreover, issues like bots, misinformation, disinformation, and spam led users to search for new alternatives. Threads has minimal political moderation efforts, making it attractive to those seeking open political discourse. The article does not mention how the platform addresses bots and disinformation, and a recent Guardian article cites concerns over Meta’s privacy practices around the platform.

3. Purpose-built AI builds better customer experiences

In the past, contact centers relied solely on phone calls to manage customer interactions. However, today's contact centers have expanded to include multiple channels such as email, social media, and chatbots. Sentiment analysis helps contact centers identify calls that require escalation or further support in real-time. Additionally, AI tools can summarize calls and automate note-taking, allowing agents to focus more on other customer needs. Instead of traditional IVR (Internal Voice Recording) systems, artificially intelligent routing can predict why a customer is calling and direct them to the most appropriate agent. Data and AI are used to measure customer sentiment during interactions and summarize calls afterward. Contact centers typically set goals around measuring customer experience, including metrics such as CSAT (Customer Satisfaction Score), sentiment analysis, first call resolution, average handle time, digital resolution rate, and digital containment rate (metric that assesses a chatbot's ability to address user queries and issues without the need for human intervention).

4. Scaling customer experiences with data and AI

This article interviews an expert in customer service experience management about the use of AI in this field. According to the expert, once a consumer makes a decision to buy, 80% of their decision to continue doing business with a brand depends on the quality of their customer service experience. Emerging AI applications are revolutionizing efficiency in customer service, including sentiment analysis, co-pilots that aid employees in resolving customer issues, and integration tools for the whole of the customer journey. As an example of the latter, instead of customers initiating contact with a chatbot or contact center, AI tools can proactively identify issues with a customer's device in advance. This early detection allows the AI system to redirect the customer to a live chat with a representative. An example of tools for employees is note-taking. The expert claims that spending 30 to 60 seconds on note-taking for each of 1,000 employees can cost millions of dollars annually. The recommendation is to prioritize clear, high-probability use cases for efficiency gains in specific tasks. Finally, AI and machine learning play a crucial role in identifying areas of variability within business processes – something often seen as a goal for large businesses, and for which Starbucks is considered a typical example. Maintaining consistency across global operations significantly contributes to the overall customer experience and brand reputation.

5. It’s easy to tamper with watermarks from AI-generated text

Watermarking involves embedding hidden patterns in AI-generated text, enabling computers to recognize that the text originates from an AI system. With the European Union's AI Act coming into effect in May, developers will be mandated to watermark AI-generated content. However, a paper from ETH Zürich scrutinized five different watermarking methods, revealing that they were susceptible to attacks, achieving over 80% success rate. These attacks can be categorized into two types: spoofing attacks and watermark removal attacks. Spoofing attacks enable malicious actors to exploit stolen watermark information to produce text that appears to be watermarked. Watermark removal attacks allow hackers to erase watermarks from AI-generated text, making it indistinguishable from human-written content.

6. A conversation with OpenAI’s first artist in residence

Alex Reben, from OpenAI, is interested in the intersection of AI and art. He notes that traditional roles like artists painting products for advertisements, such as cans of peaches for magazines or billboards, have largely disappeared. This shift is attributed to the advent of photography, which made capturing images more accessible to everyone. However, Reben highlights that fine-art photography still requires a significant level of skill, and that photography has influenced subsequent art movements. Rather than using AI to replicate photographic reality, artists might emerge that create new forms of representation – equivalent to movements like Impressionism and Cubism.

7. A survey on Large Language Model (LLM) security and privacy: The Good, The Bad, and The Ugly

This paper, which will appear in the High Confidence Computing journal, presents a survey of Large Language Models (LLMs) in relation to their security and privacy, classified into positive impacts, negative impacts, and vulnerabilities with corresponding defenses. It covers the beneficial uses of LLMs in enhancing code security (e.g., through test case generation, vulnerability identification), data privacy (by identifying personal data), and detecting and mitigating network security threats. It addresses the potential for LLMs to be exploited in executing various cyberattacks, including hardware-level, OS-level, software-level, network-level, and particularly user-level attacks which deal with social engineering and over-reliance on content generated by LLMs. The document further delves into inherent vulnerabilities within LLMs, such as data poisoning, backdoor attacks (manipulating the model), and prompt injection. The paper also outlines defense strategies such as data cleaning, adversarial training, instruction pre-processing to remove malicious prompts, and post-processing to remove toxic outputs.

8. A Survey of using Large Language Models for Generating Infrastructure as Code

Infrastructure as code (IaC) manages and provisions IT infrastructure using machine-readable code by enabling automation, consistency across the environments, reproducibility, version control, error reduction and enhancement in scalability. Automation of IaC is a necessity and example frameworks include HashiCorp’s Terraform, Ansible, Puppet and Chef. However, IaC orchestration is often a painstaking effort which requires specialized skills as well as a lot of manual intervention. The paper explores the feasibility of using Large Language Models (LLMs) for IaC, where it can handle tasks such as generating configuration files for IaC tools as well as static analysis of instructions and of code being deployed. The challenges mentioned include the steep learning curve, over-reliance, security risks and maintaining an LLM-based infrastructure.