Could This Be Written by a Robot in a Chinese Room?
ChatGPT and AV Design — Part the First
ChatGPT has been a major topic both in the world at large and in our own corner of it, AV. How, exactly, does it fit into our world? Why don’t we start by asking the thing itself? Then we’ll talk among ourselves, while the robot isn’t listening to us (or is it?).
“How can ChatGPT assist with audiovisual design?
While ChatGPT is primarily a text-based language model, it can still assist with audiovisual system design in a few ways:
- Consultation and Knowledge Sharing: ChatGPT can provide information and answer questions related to audiovisual system design. Designers or integrators can engage with the model to seek advice on specific components, industry standards, best practices, or troubleshooting techniques.
- System Recommendations: By providing details about the project requirements and constraints, ChatGPT can offer suggestions for audiovisual equipment, such as speakers, displays, projectors, or control systems. While it cannot replace the expertise of a professional audiovisual designer, it can provide insights and recommendations based on the information provided.
- Troubleshooting and Problem Solving: In case of audiovisual system issues or challenges, ChatGPT can be used to diagnose potential problems and provide troubleshooting guidance. By describing the symptoms or providing context, the model can help identify common issues and propose potential solutions or troubleshooting steps.
- User Interface Design: ChatGPT can assist in designing user interfaces and control systems for audiovisual setups. By discussing the desired functionality and interaction requirements, the model can provide suggestions for intuitive user interfaces, control layouts, or integration techniques to enhance the user experience.
- Research and Education: ChatGPT can help designers and integrators access relevant research papers, industry guidelines, or educational resources related to audiovisual system design. It can provide summaries or explanations of complex concepts, technical specifications, or emerging trends in the field.
It is important to note that while ChatGPT can provide valuable information and insights, it should be used as a supplementary tool alongside human expertise and professional judgment. As an AI model, it may not always have access to the most up-to-date information, and its responses are based on patterns learned from training data rather than real-time experience. Therefore, it is advisable to verify and cross-reference any recommendations or advice from ChatGPT with reliable sources and consult with audiovisual professionals for critical design decisions.”
So goes our first, simple answer from ChatGPT on ChatGPT and audiovisual design.
For anyone who doesn’t know, ChatGPT is a general-purpose chatbot developed by OpenAI and released late last year. Its most impressive features are the ability to carry on a conversation in natural language and a broad knowledge base gleaned from months of scraping various internet sites for both data and conversational style. It knows what it knows and can write what it writes because of all that it has learned. It doesn’t merely reflect that knowledge; it makes assumptions based on patterns it has seen in the data, and it can create new content from the patterns it has found in the vast online world that serves as its training set. That is both its greatest strength and its greatest weakness.
The first thing we notice is the comfort in conversing with it; there’s no hunting for the magic words that will lead the computer to give you an answer. You ask it a question in simple English, and it answers you. Let’s ask it to design a 12-person Zoom Room.
“For a 12-person Zoom room, here are some system recommendations to ensure a productive and immersive audiovisual experience:
- Display: Choose a large display or multiple displays to ensure clear visibility for all participants. A 55 to 75-inch LED or LCD display with high resolution (such as 4K) would be suitable.
- Camera: Select a high-quality PTZ (Pan-Tilt-Zoom) camera capable of capturing the entire room with smooth and accurate movements. Look for features like wide-angle lenses, auto-focus, and optical zoom to accommodate various seating arrangements.
- Audio: Opt for a professional-grade audio setup to ensure clear and balanced sound for all participants. Consider a combination of ceiling or table-mounted microphones to capture audio from different areas of the room, along with a sound processing unit or DSP for echo cancellation and noise reduction.
- Speakers: Install ceiling-mounted or wall-mounted speakers strategically to ensure even audio distribution throughout the room. This will help participants hear remote participants clearly without distortion or echo.
- Control System: Implement a user-friendly control system that integrates all the audiovisual components and simplifies the operation of the Zoom room. This can include a touch panel or a user interface with controls for video switching, audio volume, camera presets, and meeting controls.
- Lighting: Proper lighting is essential for video conferences. Consider using adjustable LED lighting fixtures to ensure participants are well-lit and free from distracting shadows. Incorporating dimming controls or automated lighting presets can provide flexibility for different meeting scenarios.
- Connectivity: Ensure robust internet connectivity for seamless video conferencing. Use a wired network connection or a reliable Wi-Fi setup to support high-definition video streaming and screen sharing.
- Furniture and Layout: Choose comfortable and ergonomic furniture that accommodates all participants. Arrange the seating and display setup in a way that allows everyone to have a clear view of the screen and the camera captures all participants without obstruction.
- Acoustic Treatment: Consider implementing acoustic treatments such as sound-absorbing panels or ceiling baffles to minimize echoes and improve audio quality within the room.
- Room Design: Create a professional and aesthetically pleasing environment by selecting appropriate décor, wall colors, and branding elements that align with the organization’s style and image.
Remember, these recommendations may vary based on the specific requirements, budget, and available space for your Zoom room. It is advisable to consult with professional audiovisual integrators or designers who can provide customized solutions tailored to your needs.”
Note how ChatGPT magnifies some of the worst habits we have in our industry, specifically in suggesting a “high-quality” camera without defining what that means. Note also that the answer, while generic, can provide a basic framework for a discussion, especially in the hands of a layperson who doesn’t know what to ask. It is interesting, but is it useful? This is one place where the technology is useful today: it can answer a question or give a basic overview of different technologies. I can see it being a tool to create a first draft of a pre-project question sheet if one isn’t already available, to rough-draft executive summaries, or to give a framework for an expert to fill out later.
What if we’re comparing direct-view LED walls to tiled LCDs, for example? It’s easy enough to ask, and even get what appears to be a reasonable answer.
“The main difference between a direct-view LED wall and tiled LCDs lies in the technology used to create the display and the characteristics of the resulting image.
- Technology: Direct-view LED wall utilizes individual light-emitting diodes (LEDs) as the building blocks of the display. These LEDs emit light directly, and when combined together, they form the entire screen. On the other hand, tiled LCDs (Liquid Crystal Displays) are composed of multiple individual LCD panels that are arranged in a grid-like fashion to create a larger display.
- Image Quality: Direct-view LED walls generally offer superior image quality compared to tiled LCDs. LED walls typically provide higher brightness levels, wider color gamut, and better contrast ratios. This makes them well-suited for environments with challenging lighting conditions, such as brightly lit rooms or outdoor spaces. LED walls also tend to have better viewing angles, ensuring that the image quality remains consistent even when viewed from different angles.
- Seamless Display: Direct-view LED walls are known for their seamless display. Since each LED serves as a single pixel, there are no visible bezels or gaps between the individual LEDs. This creates a uniform and uninterrupted viewing experience. In contrast, tiled LCDs have narrow bezels between the LCD panels, which can create thin black lines or gaps on the screen. Although the bezels on tiled LCDs have become thinner over time, they can still be noticeable when viewing content up close.
- Scalability: Direct-view LED walls are highly scalable and can be easily customized to fit different sizes and shapes. LED panels can be combined to create large-scale displays of virtually any size, making them suitable for applications such as massive video walls or outdoor signage. Tiled LCDs, on the other hand, are limited in scalability as they are composed of fixed-size LCD panels. Expanding the display area of a tiled LCD system requires adding more individual LCD panels, which can result in visible seams and increased complexity.
- Power Consumption: Tiled LCDs typically consume less power compared to direct-view LED walls. LED walls require a significant amount of energy to power the individual LEDs, especially for larger displays. Tiled LCDs, although more power-efficient, still require backlighting for the LCD panels.
In summary, direct-view LED walls offer superior image quality, seamless display, scalability, and are well-suited for challenging lighting conditions. Tiled LCDs, while generally more power-efficient, have visible bezels between panels and are limited in scalability. The choice between the two technologies depends on the specific requirements of the application, budget, viewing environment, and desired image quality.”
Note that, while budget was mentioned, it did NOT tell us how much lower-cost LCD panels are than a similarly sized LED wall. The other items give an accurate and reasonable overview. This brings us to another present-day use case for ChatGPT and similar tools, even for a professional: it can quickly create an overview and a framework for a more in-depth report or essay written by someone with real expertise.
Why use it as a framework and not just accept a machine-generated report? Several reasons. The first is the surprising weakness in the current crop of AI chatbots, one for which decades of science fiction have left us ill-prepared. ChatGPT can write natural prose that even feels creative, but it is surprisingly weak at knowing what is and is not factual data. Because it works by recognizing patterns and creating content that matches those patterns, it is very good at certain kinds of tasks. What can it do well? I’ve seen someone use it to create complete plotlines for a hypothetical new “Star Trek” show, with a result that didn’t seem appreciably worse than quite a bit of human-generated Star Trek. It is particularly good at creating what Trek fans refer to as “technobabble,” because generating plausible-sounding phrases that resemble something we’ve seen before is exactly what the tool does best.
Where does this pattern-matching reveal itself to be a weakness? It’s so intent on finding patterns that it will create them. In one noteworthy recent example, a lawyer used ChatGPT to create a legal brief. It did so, citing several seemingly relevant court decisions to back its position. The problem? The court cases ChatGPT referenced did not exist. Its skill at creating content that fits existing patterns worked too well: it came up with something that sounded very much like an actual court case, yet wasn’t.
This flies in the face of everything we’ve learned from science fiction, where the robot character always has a commanding knowledge of fact coupled with a tendency toward literalism. ChatGPT can hallucinate, but it can also write fiction. As part of my experiment, I asked it to write a story about a fisherman living in the desert; it gave me something with a vaguely Middle Eastern feel. With one sentence, “set it in the American Southwest,” I was given a variation on the theme. Ali became Carlos. Mesquite trees appeared. The language subtly shifted to a more American western feel. It felt like a magic trick.
And that leads us to the bottom line: ChatGPT looks as if it thinks, but it does not think. Sometimes this matters, sometimes it doesn’t. It reminds me most of all of philosopher John Searle’s “Chinese Room” thought experiment. For those who don’t know, this posits a person locked inside a room and sent a series of questions in Chinese from someone outside. The person is given a rulebook, written in their native language, explaining which Chinese symbols create answers to which questions. The rules don’t explain the meaning of the Chinese characters; they just give a series of algorithms for answering the questions. I asked a question, I got an answer. Am I connecting to another mind?
From the perspective of someone outside the room, the subject is speaking Chinese, just as, to one of us logged into a ChatGPT account, the chatbot gives meaningful, reasonable answers to whatever questions are posed. Inside the room, however, there is no mind as we know it. The subject is checking a rulebook and relaying back Chinese answers without understanding them.
ChatGPT is, at its best, a Chinese room. It is certainly more sophisticated than the purely algorithmic rulebook in the original thought experiment, but it doesn’t have reason or initiative. If you ask it about LED vs. tiled LCD walls, it won’t know to follow up by asking about content. It doesn’t know what it’s saying, doesn’t think of context, and cannot do what I’ve talked about here over the years: ask the next question.
So yes, it can create a line of plausible-sounding faux-technical dialog as well as a plausible-sounding legal citation. The problem is that, absent a mind, it doesn’t know that we need the legal citation to be literally real but don’t care if the science-fiction text is merely plausible.
The risks to us are twofold. The first is disruption of current industries. What happens when a room full of sitcom writers can be replaced by an editor and a robot? What happens when we use it to draft technical narratives? (Look for more on that in the future!) Now we are all Ned Ludd, faced with the power loom. Yes, there’s a temptation to smash it. Generations later, Ludd is only remembered as anti-technology, while machine-created textiles are the norm. He lost, and those of us trying to sweep back the tide will lose as well.
The second risk is the one inherent in the Chinese room: if you squint just a bit, from the outside it certainly can SEEM like a mind. It isn’t. We need to remember that, and be sure we aren’t giving answers that are reprocessed conventional wisdom, machine hallucinations and just plain mistakes. We need to know what the tool doesn’t know so we can use it mindfully and thoughtfully.
Let this serve as an introduction to my thoughts on the topic. Watch this space for more.