Even though artificial intelligence can conceptually trace its roots back to the 1950s, the reality is that AI and machine learning went mainstream not too long ago. Since November of 2022, anyone with a computer and internet access has been able to dip their toes in the AI waters for not much more than the time needed to understand how to make it work. Unless you’ve had your head in the sand, you probably know this is a big deal. But what does this mean for the profession of architecture? Are we all going to be replaced? Are there any ethical considerations that we should start thinking about? … Welcome to EP 122: Architecture and Artificial Intelligence
Today we are going to start the process of discussing artificial intelligence and how it is intersecting with the practice of architecture, as well as the education of architecture students. That is an Everest-sized task, but we have to start somewhere. Andrew and I have both been exploring platforms like ChatGPT as well as image generators like Midjourney, DALL-E, and others. There is so much information out there that we felt like we should bring in a guest for today’s show who can help facilitate our conversation.
Introduction jump to 1:33
Kory Bieg is an Associate Professor of Architecture at The University of Texas at Austin. He received his Master of Architecture from Columbia University, his Bachelor of Arts in Architecture from Washington University in St. Louis, and is NCARB certified and a registered architect in the State of Texas.
In 2005, Kory Bieg founded OTA+, an architecture, design, and research office that specializes in the development and use of current, new, and emerging digital technologies for the design and fabrication of buildings, building components, and experimental installations. OTA+ uses current design software and CNC machine tools to both generate and construct conceptually rigorous and formally unique design proposals.
Since 2013, he has served as Chair of the Texas Society of Architects Emerging Design + Technology conference and is co-Director of TEX-FAB Digital Fabrication Alliance. He has served on the Board of SXSW Eco Place by Design and the Association for Computer-Aided Design in Architecture, otherwise known as ACADIA.
Artificial Intelligence 101 jump to 5:58
Kory provided us with a very brief history of artificial intelligence. Initially, the goal was to make the machine/computer think like a human. The Turing Test (proposed by Alan Turing in 1950) was meant to test the ability of a machine to “fool” a human: if a person who was interacting with a machine did not know they were interacting with a machine, then the machine had passed the test. It was originally referred to as the imitation game. This was developed at a time when the main goal was to have a machine think and behave like a human being. In more recent times, we have realized that this is not really a possibility. We have shifted our goals to make machines think like machines but train them to do the things they can do better than humans, like analyzing big chunks of data, running complicated algorithms, and so on. One of the first AI-generated images that Kory ever saw was from Matias del Campo (his Instagram), and it was a house made of feathers.
There are several types of machine learning models, each of which learns from a specific data set. The information that is provided to the machine is the only information that it learns. For example, the AI in a self-driving car is only fed data sets that deal with driving, like road conditions, mapping, traffic laws, driving situations, etc. In this scenario, the machine learns a very specific set of data and then solves the complex problems created by or related to only that data.
The most current version of this data scraping is diffusion-based AI. These are the text-to-image types of artificial intelligence platforms. They scrape internet sources for the data/images that begin to inform a noise cloud of random pixels. The process is not collage: it does not stitch found images together. Instead, it assembles the information pixel by pixel to create all-new images. These images are created by the user typing out a “prompt” that works as the descriptor of the image. The AI then creates images based on the prompts that have been entered. There are many of these platforms available at the moment, as they are just hitting the scene. Time will tell which of them survive. You can think about it like the beginning of social media platforms a few years ago.
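For the curious, here is a very rough Python sketch of the "start from noise and refine pixel by pixel" idea. It is a toy under heavy assumptions, not how any real platform is built: the function name `toy_denoise` is made up for illustration, and the `target` grid stands in for what a trained model would predict from the prompt. A real diffusion model has no stored target image; it uses a neural network to estimate what to remove at each step.

```python
import random


def toy_denoise(target, steps=50, seed=0):
    """Start from random pixels and refine them a little at every step.

    ASSUMPTION: `target` is a stand-in for the prediction of a trained
    model. Real text-to-image systems do not have a target image on hand.
    """
    rng = random.Random(seed)
    # Step 0: a "noise cloud" of random pixel values between 0 and 1.
    image = [[rng.random() for _ in row] for row in target]
    for step in range(steps):
        # Close a growing share of the remaining gap at every pixel;
        # on the final step the share is 1.0, so the gap is closed fully.
        rate = 1.0 / (steps - step)
        image = [
            [px + rate * (goal - px) for px, goal in zip(img_row, tgt_row)]
            for img_row, tgt_row in zip(image, target)
        ]
    return image


# A tiny 2x2 "image": 0.0 is black, 1.0 is white.
checkerboard = [[0.0, 1.0], [1.0, 0.0]]
result = toy_denoise(checkerboard, steps=10)
```

The point of the sketch is only that nothing is being cut and pasted: every pixel begins as noise and is nudged, step by step, toward something that matches the description.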
The prompt is a major factor in the image that is created. But as the platforms have progressed, Kory is finding that it is more about curating the iterations of the images that come from one original prompt. These images can be remixed and iterated, then taken into another AI platform to continue refining them. So there is a workflow of sorts that can be used to craft the final AI images that Kory produces. A very short prompt can provide just as complex and wonderful an image as a very long and descriptive prompt.
What Platforms Should People be Paying Attention to? jump to 16:27
Midjourney is probably the most well-known at the moment, but there are several platforms that might be “useful” to architects and designers. Again, it should be noted that there is a plethora of these on the market at the moment, but some of them seem to rise to the forefront for us as architects and designers. Midjourney, which runs through a Discord channel, is the forerunner for image creation. It is more creative in its generation, and you can even modify the level of creativity.
Another is DALL-E, created by OpenAI, which is now on its second version, DALL-E 2. This one is a bit more straightforward in its interpretation of the prompt: it will more directly interpret the text prompt you create and input. So it may not produce an image that is as creatively “loose” as those from Midjourney.
Stable Diffusion (DreamStudio) is another that more directly interprets the text prompts from the user. However, it can still be useful for crafting certain imagery. It also has some “sub-components,” such as ControlNet. This AI allows the blending or mixing of two pictures in an exact way that can then be embellished in multiple ways. This is more of a direct image-to-image layout transfer. For example, one specific pose in the source image is kept exactly the same in the newly AI-created image.
Hugging Face is an open-source platform that contains all types of AI modifications and scripts. It may be a bit more advanced, but it can do some very specific tasks. For example, it has a script into which you can insert an AI-produced image, and it will give you the prompt that AI would use to create that image. This can produce some wild text prompts, but Kory has used it to analyze his own work, and the prompts worked quite well. In one instance, his architectural imagery gave references to things like Selena Gomez or the Texas Revolution.
ChatGPT is a text-based AI; officially, it is an artificial intelligence chatbot. It uses the full internet as a data set and mainly provides text. This one is really having a large impact in all areas. While the platforms above are mainly for creative image creation, ChatGPT produces text that can be used in any circumstance. It can write reports, edit texts, write code, solve math problems, create ad copy, draft a bibliography on a certain topic, and so much more. It has shown up in many areas of life in the past few months. It was released on November 30, 2022, and GPT-4 was just recently released. So these platforms are all updating at a fast pace and continue to increase their capabilities.
The Impact on the Education Process jump to 19:45
These platforms will have a definite impact on the education of future architects. Kory is using the text-to-image platforms as a sketch tool. He recently had his students use Hugging Face to let AI read precedent images and analyze existing work. They used AI to look at massing, circulation, and other typical elements, along with the descriptions of those images. This allowed them to understand an architect’s work in a new and specific way, and the students then used that feedback to generate their own ideas. From there, they tried to recreate the AI imagery through more traditional means with current software like Rhino and Grasshopper. This is not an easy task, as the imagery from AI is so richly complex and detailed within its two-dimensional creations.
Kory believes that within a year or so, these platforms will move from 2D images into 3D output. This is somewhat scary but also exciting. This may allow more interaction with the iterations and allow designers to place them into optimization-based analysis, daylighting models, or generative models. This would allow us as architects to analyze these in a real context. Right now, they are all totally without context. Being able to export these into a real context will be a game changer for the profession. It will change the way we work as architects and designers and impact our workflows in some new and interesting ways.
ChatGPT is also shaking up the education system. This text-based platform is impacting education well beyond just architecture. It can do some interesting things and has many in academia very concerned about the way it will impact education. At the moment, most of academia is trying to find ways to manage this platform. But one big issue with this platform is that it is using the internet as the data set, and not everything on the internet is factual or true. So there is a possibility that the response it provides is incorrect. There have been several instances of ChatGPT providing false citations for references that do not exist.
There are two main issues with these new AI platforms using the internet as the data set. The first is the one mentioned above: not all information on the web is factual or correct. The second is the manner in which the data sets are “tagged” internally by the AI or the data set provider. This creates issues for many professions or users, as there are so many different identifiers for the same item. For example, architects tend to call doors and windows by other names, like openings or even apertures, but the AI does not recognize those names as tags for those items. So many of the specific identifiers are not represented in the current system. Also, some bias can be introduced by the data set providers as they tag the data. This is where human bias can become involved.
Our conversation began to go down several rabbit holes at this point. We discuss the characteristics of the prompt. Kory reinforces his idea that the iterations are more important than the actual prompt, though the usage of prompts is also changing. We touch on the translation of the imagery into actual projects. How will this change when AI is producing three-dimensional models? While the point at which those models are actually useful may be a bit further down the road, it is coming quickly. We also discuss the consequences of utilizing the internet as the data set behind these platforms and its bearing on the outcome. Those with more “bandwidth” on the internet will be more abundantly represented in the AI responses. This may not always produce the “best of” results because an abundance on the web does not always equate to good.
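To make the tagging problem a little more concrete, here is a small Python sketch of one possible workaround: translating an architect's vocabulary into the everyday words a data set is more likely to be tagged with. Everything in it is an assumption for illustration. The `ARCHITECT_TO_TAGS` mapping and the `expand_terms` function are hypothetical, and none of the platforms discussed here are known to work this way.

```python
# Hypothetical mapping from architects' vocabulary to everyday tags.
# ASSUMPTION: these pairings are illustrative, not taken from any data set.
ARCHITECT_TO_TAGS = {
    "opening": ["door", "window"],
    "aperture": ["window"],
}


def expand_terms(prompt):
    """Add likely everyday tags after each architectural term in a prompt."""
    expanded = []
    for word in prompt.split():
        expanded.append(word)
        key = word.lower().strip(".,;:")
        if key.endswith("s"):
            key = key[:-1]  # crude singular form, good enough for a sketch
        for tag in ARCHITECT_TO_TAGS.get(key, []):
            expanded.append(f"({tag})")
    return " ".join(expanded)


print(expand_terms("brick wall with deep apertures"))
# prints: brick wall with deep apertures (window)
```

A lookup like this only papers over the gap, and it carries the biases of whoever writes the mapping, which is the same concern raised above about the people tagging the data.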
Future of AI and How it Pertains to Architects jump to 39:42
Up to this point, most architecture has been based on our ability to document it. As AI moves forward, it will begin to change this condition. The ability to take these more “fantastical” concepts and translate them into three-dimensional models will change the workflow of architecture. Now, this may be further down the line, but it is a definite possibility.
Kory thinks that there might be two paths this could take. It has been stated that we have been through this before in several instances. This type of destabilization, fear, and excitement happened when computers were introduced, when VR was introduced, and when Grasshopper was introduced. These disruptions did not impact the profession as dramatically as was initially expected. While they had an impact, they did not completely transform the profession. So these AI elements may not pan out to have such a large impact either. They may develop along their own parallel paths. The other possibility is that they ALL converge at some point in the future. This is when they might greatly transform the profession. But at this point, we have no idea as to how this new AI technology will ultimately develop.
Currently, there is a great deal of exploration happening as to how this will impact the future of the profession. At this point, it is all very experimental and is in its infancy. How it will continue to grow is the big unknown. It might make it possible for the profession to use AI to replace some of the more mundane tasks that we do, such as detailing, construction documents, specifications, and the like. Can AI replace those parts of the process and, by proxy, allow architects to concentrate on other portions of the process, such as design, performance, and the more qualitative portions of the design process?
For everyone, the huge question is how it will pan out. It will develop without question. Kory wants to view this all from an optimistic viewpoint. It could make things better for us in many ways. These machine platforms will never be able to replace all of us. They are not human. They will do some things better than humans, but there are also things they will never be able to do as well as a human. So there is room to be positive about all of this new technology. All these new technologies have so many possibilities for the profession and the educational system. Kory believes the best way forward is to embrace it and make it work for you.
What’s the Rank jump to 56:11
Well, we were able to convince Kory to join in on the What’s the Rank for this week. We thought we would also make it an easy one for our guest so we, of course, went with a food ranking. We made some overlapping selections but still managed to keep it varied.
Today we are ranking [drum roll please] ….
The Best Three Types of Cookies
|Andrew’s Best Cookies
|White Macadamia Chocolate Chip
|Soft and Salted Chocolate Chip
|Bob’s Best Cookies
|Kory’s Best Cookies
|Frozen Thin Mint
|Thin crispy salty chocolate chip
|Mom’s Christmas Sugar cookie
Bob admitted he did not like cookies as a rule but, in the same chat, also admitted to eating a full sleeve of Kory’s number three pick of Thin Mints in a single sitting. Girl Scout cookies took two spots, which was not really a surprise. Then the illustrious Oreo made the list for two of us. Who didn’t see that coming? But then Bob managed to make fun of Kory and me because the chocolate chip cookie was such a basic choice. He proposed it was such an obvious choice, and so he was against it to some degree. Yet he managed to pick the blandest of all cookies in existence, the Nilla Wafer.
EP 122: Architecture and Artificial Intelligence
Artificial intelligence is here and will continue to evolve. Don’t be afraid to explore. The more you put in, the more you get in return. Kory is a strong supporter of this dialog. It is a continual learning process, and we, as architects, should always be learning. This is part of the process. It can be forced upon you with its framework already in place, or you can explore it in these early stages and allow it to do the work you want it to do for you. It will never replace our abilities as humans in the design process, but it seems that it will definitely change our process in the future.
ps – sometimes the images that come from the prompt are completely nonsensical … check these out and read the caption for what was entered to generate them …