The researchers didn’t set out to create a periodic table of machine learning. Machine learning includes everything from classification algorithms that can detect spam to the deep learning algorithms that power LLMs. After joining the Freeman Lab, Alshammari began studying clustering, a machine-learning technique that classifies images by learning to organize similar images into nearby clusters. The researchers filled in one gap by borrowing ideas from a machine-learning technique called contrastive learning and applying them to image clustering.
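As a rough illustration of the borrowed idea, the sketch below implements a standard InfoNCE-style contrastive loss of the kind such clustering methods build on; it is illustrative only, and the function names are hypothetical rather than from the paper. Embeddings of two “views” of the same image are pulled together while different images are pushed apart.

```python
import numpy as np

# Illustrative only: a standard InfoNCE-style contrastive loss, not the
# I-Con authors' implementation.
def info_nce(z1, z2, temperature=0.5):
    """z1, z2: L2-normalized embeddings of two views of the same images, (N, d)."""
    logits = (z1 @ z2.T) / temperature          # pairwise similarities
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.diag(log_probs).mean()           # positives sit on the diagonal

rng = np.random.default_rng(0)
z1 = rng.normal(size=(8, 16))
z1 /= np.linalg.norm(z1, axis=1, keepdims=True)
z2 = z1 + 0.05 * rng.normal(size=z1.shape)      # a slightly perturbed "view"
z2 /= np.linalg.norm(z2, axis=1, keepdims=True)
print(info_nce(z1, z2))                         # lower when the views agree
```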
Helping power-system planners prepare for an unknown future
A new study finds that people are neither entirely enthusiastic nor totally averse to AI. Rather than falling into camps of techno-optimists and Luddites, people are discerning about the practical upshot of using AI, case by case. On one side, generative AI can inherit and proliferate biases that exist in training data, or amplify hate speech and false statements. On the other side, Shah proposes that generative AI could empower artists, who could use generative tools to help them make creative content they might not otherwise have the means to produce. “We have the ability to think and dream in our heads, to come up with interesting ideas or plans, and I think generative AI is one of the tools that will empower agents to do that, as well,” Isola says. The same way a generative model learns the dependencies of language, if it’s shown crystal structures instead, it can learn the relationships that make structures stable and realizable, he explains.
From physics to generative AI: An AI model for advanced pattern generation
These powerful machine-learning models draw on research and computational advances that go back more than 50 years. While all machine-learning models must be trained, one issue unique to generative AI is the rapid fluctuation in energy use that occurs over different phases of the training process, Bashir explains. The researchers also used I-Con to show how a data debiasing technique developed for contrastive learning could boost the accuracy of clustering algorithms.
An influential 2015 paper on “algorithm aversion” found that people are less forgiving of AI-generated errors than of human errors, whereas a widely noted 2019 paper on “algorithm appreciation” found that people preferred advice from AI over advice from humans. “There are differences in how these models work and how we think the human brain works, but I think there are also similarities.” Generative AI chatbots are now being used in call centers to field questions from human customers, but this application underscores one potential red flag of implementing these models: worker displacement. As long as your data can be converted into this standard token format, then in theory, you could apply these methods to generate new data that look similar. In 2014, a machine-learning architecture known as a generative adversarial network (GAN) was proposed by researchers at the University of Montreal.
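For readers who want the GAN idea in one place, here is a schematic of its adversarial objective, framework-free and purely illustrative: a discriminator is rewarded for telling real samples from generated ones, while the generator is rewarded for fooling the discriminator.

```python
import numpy as np

# Schematic GAN losses (illustrative; real GANs train two networks jointly).
def gan_losses(d_real, d_fake):
    """d_real, d_fake: discriminator's probabilities that samples are real."""
    d_loss = -(np.log(d_real).mean() + np.log(1 - d_fake).mean())
    g_loss = -np.log(d_fake).mean()   # the common non-saturating generator loss
    return d_loss, g_loss

# A discriminator doing well: confident on real data, doubtful on fakes.
print(gan_losses(d_real=np.array([0.9, 0.8]), d_fake=np.array([0.2, 0.3])))
```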
New method simplifies the construction process for complex materials
Before the generative AI boom of the past few years, when people talked about AI, typically they were talking about machine-learning models that can learn to make a prediction based on data. One popular type of model, called a diffusion model, can create stunningly realistic images but is too slow and computationally intensive for many applications. The electricity demands of data centers are one major factor contributing to the environmental impacts of generative AI, since data centers are used to train and run the deep learning models behind popular tools like ChatGPT and DALL-E. While the explosive growth of this new technology has enabled rapid deployment of powerful models in many industries, the environmental consequences of this generative AI “gold rush” remain difficult to pin down, let alone mitigate.
Generative AI is now upending educational models and, in some cases, complicating efforts to improve student outcomes. “Throughout my career, I’ve tried to be a person who researches education and technology and translates findings for people who work in the field,” says Reich. “LLMs are a good interface for all sorts of models, like multimodal models and models that can reason. An efficient image-generation model would unlock a lot of possibilities,” he says. HART uses about 31 percent less computation than state-of-the-art models. An autoregressive model uses an autoencoder to compress raw image pixels into discrete tokens, and to reconstruct the image from predicted tokens.
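To make the tokenization step concrete, here is a toy vector-quantization sketch, assuming the common codebook-lookup approach; HART’s actual autoencoder is a learned neural network, and the names below are hypothetical.

```python
import numpy as np

# Toy sketch of compressing image patches into discrete tokens and back.
def quantize(patches, codebook):
    """Map each patch vector to the index of its nearest codebook entry."""
    dists = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return dists.argmin(axis=1)                  # discrete token ids

def reconstruct(tokens, codebook):
    return codebook[tokens]                      # decode tokens back to vectors

rng = np.random.default_rng(0)
codebook = rng.normal(size=(256, 16))            # 256 possible tokens
patches = rng.normal(size=(64, 16))              # 64 flattened image patches
approx = reconstruct(quantize(patches, codebook), codebook)  # lossy round trip
```

The round trip is lossy by design, which is exactly the information loss the hybrid approach below has to correct for.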
3 Questions: Shaping the future of work in an age of AI
A user only needs to enter one natural language prompt into the HART interface to generate an image. But the generative artificial intelligence techniques increasingly being used to produce such images have drawbacks. Even if the AI is trained on a wealth of data, people feel AI can’t grasp their personal situations. For example, people tend to favor AI when it comes to detecting fraud or sorting large datasets — areas where AI’s abilities exceed those of humans in speed and scale, and personalization is not required. The researchers tested whether the data supported their proposed “Capability–Personalization Framework” — the idea that in a given context, both the perceived capability of AI and the perceived necessity for personalization shape our preferences for either AI or humans.
AI tool generates high-quality images faster than state-of-the-art approaches
“We’ve shown that just one very elegant equation, rooted in the science of information, gives you rich algorithms spanning 100 years of research in machine learning.” Because the diffusion model de-noises all pixels in an image at each step, and there may be 30 or more steps, the process is slow and computationally expensive. Their hybrid image-generation tool uses an autoregressive model to quickly capture the big picture and then a small diffusion model to refine the details of the image. During the development of HART, the researchers encountered challenges in effectively integrating the diffusion model to enhance the autoregressive model. The resulting generation process consumes fewer computational resources than typical diffusion models, enabling HART to run locally on a commercial laptop or smartphone. In 2017, researchers at Google introduced the transformer architecture, which has been used to develop large language models, like those that power ChatGPT.
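The cost structure is easiest to see in code. Below is a toy reverse-diffusion loop, with a placeholder standing in for the trained de-noising network: every one of the 30-plus steps touches every pixel.

```python
import numpy as np

def predict_noise(x, t):                  # placeholder for a trained network
    return 0.1 * x

def denoise(x, T=30, step=0.1):
    for t in reversed(range(T)):          # 30+ full passes over all pixels
        x = x - step * predict_noise(x, t)
    return x

noisy = np.random.default_rng(0).normal(size=(64, 64))  # pure-noise "image"
image = denoise(noisy)
```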
New models often consume more energy for training, since they usually have more parameters than their predecessors. Companies release new models every few weeks, so the energy used to train prior versions goes to waste, Bashir adds. With traditional AI, energy usage is split fairly evenly between data processing, model training, and inference, which is the process of using a trained model to make predictions on new data. “The pace at which companies are building new data centers means the bulk of the electricity to power them must come from fossil fuel-based power plants,” says Bashir. While not all data center computation involves generative AI, the technology has been a major driver of increasing energy demands. Scientists have estimated that the power requirements of data centers in North America increased from 2,688 megawatts at the end of 2022 to 5,341 megawatts at the end of 2023, partly driven by the demands of generative AI.
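A quick back-of-the-envelope check of those figures shows that demand nearly doubled in a single year:

```python
growth = (5341 - 2688) / 2688
print(f"{growth:.0%} increase from end of 2022 to end of 2023")  # ~99%
```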
One new type of AI model, called “state-space models,” has been designed specifically to understand sequential patterns more effectively. Konstantin Rusch and Daniela Rus have developed what they call “linear oscillatory state-space models” (LinOSS), which leverage principles of forced harmonic oscillators — a concept deeply rooted in physics and observed in biological neural networks. MIT neuroscientists find a surprising parallel in the ways humans and new AI models solve complex problems. AI supports the clean energy transition as it manages power grid operations, helps plan infrastructure investments, guides development of novel materials, and more. Large language models can learn to mistakenly link certain sentence patterns with specific topics — and may then repeat these patterns instead of reasoning. In this context, papers that unify and connect existing algorithms are of great importance, yet they are extremely rare.
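As a rough sketch of the oscillatory idea (not the authors’ implementation; all names here are hypothetical), each hidden unit can be modeled as a forced harmonic oscillator driven by the input sequence:

```python
import numpy as np

# Toy oscillatory state-space layer: y'' = -a*y + b*u(t), integrated with a
# symplectic Euler step for stability. Illustrative only.
def oscillator_layer(u, stiffness, input_weights, dt=0.1):
    """u: input sequence (T,); returns hidden states (T, H)."""
    y = np.zeros_like(stiffness)                 # positions
    v = np.zeros_like(stiffness)                 # velocities
    states = np.empty((len(u), len(stiffness)))
    for t, u_t in enumerate(u):
        v += dt * (-stiffness * y + input_weights * u_t)
        y += dt * v
        states[t] = y
    return states

u = np.sin(np.linspace(0, 10, 200))              # example input sequence
states = oscillator_layer(u, stiffness=np.linspace(0.5, 5.0, 16),
                          input_weights=np.ones(16))
```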
- Diffusion models were introduced a year later by researchers at Stanford University and the University of California at Berkeley.
- By leveraging natural language, the system makes design and manufacturing more accessible to people without expertise in 3D modeling or robotic programming.
New algorithm unlocks high-resolution insights for computer vision
What all of these approaches have in common is that they convert inputs into a set of tokens, which are numerical representations of chunks of data. In natural language processing, a transformer encodes each word in a corpus of text as a token and then generates an attention map, which captures each token’s relationships with all other tokens. This attention map helps the transformer understand context when it generates new text. This recurrence helps the model understand how to cut text into statistical chunks that have some predictability. And it has been trained on an enormous amount of data — in this case, much of the publicly available text on the internet.
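Here is a minimal sketch of how such an attention map is computed, assuming the standard scaled dot-product formulation (the weight matrices below are random placeholders rather than learned parameters):

```python
import numpy as np

def attention_map(X, Wq, Wk):
    """X: token embeddings (T, d). Returns a (T, T) map of attention weights."""
    Q, K = X @ Wq, X @ Wk                        # queries and keys
    scores = Q @ K.T / np.sqrt(Q.shape[-1])      # scaled dot products
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return w / w.sum(axis=-1, keepdims=True)     # softmax over each row

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                      # 5 tokens, 8-dim embeddings
A = attention_map(X, rng.normal(size=(8, 8)), rng.normal(size=(8, 8)))
# A[i, j] says how strongly token i attends to token j.
```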
Beyond electricity demands, a great deal of water is needed to cool the hardware used for training, deploying, and fine-tuning generative AI models, which can strain municipal water supplies and disrupt local ecosystems. “Our goal was to capture the stability and efficiency seen in biological neural systems and translate these principles into a machine learning framework,” explains Rusch. Empirical testing demonstrated that LinOSS consistently outperformed existing state-of-the-art models across various demanding sequence classification and forecasting tasks. Moreover, the researchers rigorously proved the model’s universal approximation capability, meaning it can approximate any continuous, causal function relating input and output sequences. The team imagines that the emergence of a new paradigm like LinOSS will be of interest to machine learning practitioners to build upon.
MIT researchers “speak objects into existence” using AI and robotics
While compressing pixels into discrete tokens boosts the model’s speed, the information loss that occurs during compression causes errors when the model generates a new image. But because the model has multiple chances to correct details it got wrong, the images are high-quality. “The diffusion model has an easier job to do, which leads to more efficiency,” he adds. The ability to generate high-quality images quickly is crucial for producing realistic simulated environments that can be used to train self-driving cars to avoid unpredictable hazards, making them safer on real streets. Instead of having a model make an image of a chair, perhaps it could generate a plan for a chair that could be produced. The models also have the capacity to plagiarize, and can generate content that looks like it was produced by a specific human creator, raising potential copyright issues.
Popular diffusion models, such as Stable Diffusion and DALL-E, are known to produce highly detailed images. By iteratively refining their output, these models learn to generate new data samples that resemble samples in a training dataset, and have been used to create realistic-looking images. Their tool, known as HART (short for hybrid autoregressive transformer), can generate images that match or exceed the quality of state-of-the-art diffusion models, but about nine times faster. Because the diffusion model only predicts the remaining details after the autoregressive model has done its job, it can accomplish the task in eight steps, instead of the usual 30 or more a standard diffusion model requires to generate an entire image. This minimal overhead of the additional diffusion model allows HART to retain the speed advantage of the autoregressive model while significantly enhancing its ability to generate intricate image details.
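Schematically, the hybrid pipeline looks like the sketch below; every function is a stand-in for a trained model, and the step counts mirror the article’s figures (one autoregressive pass, then eight diffusion steps instead of 30 or more).

```python
import numpy as np

def autoregressive_tokens(prompt, n_tokens=64):   # stand-in for the AR model
    rng = np.random.default_rng(sum(map(ord, prompt)))
    return rng.integers(0, 256, size=n_tokens)

def decode(tokens):                               # stand-in for the detokenizer
    return tokens.reshape(8, 8) / 255.0           # coarse "big picture"

def refine_residual(image, steps=8):              # stand-in diffusion refiner
    for _ in range(steps):                        # 8 steps, not 30+
        image = image + 0.01 * (0.5 - image)      # placeholder detail fix
    return image

coarse = decode(autoregressive_tokens("a chair by a window"))
final = refine_residual(coarse)
```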
Noman Bashir, a fellow with the MIT Climate and Sustainability Consortium and a postdoc at CSAIL, speaks with Wired reporter Molly Taft about AI and energy consumption. A data center is a temperature-controlled building that houses computing infrastructure, such as servers, data storage drives, and network equipment. Chilled water is used to cool a data center by absorbing heat from the computing equipment. While it is difficult to estimate how much power is needed to manufacture a GPU, a type of powerful processor that can handle intensive generative AI workloads, it would be more than what is needed to produce a simpler CPU because the fabrication process is more complex. In a 2021 research paper, scientists from Google and the University of California at Berkeley estimated that the process of training GPT-3 alone consumed 1,287 megawatt-hours of electricity (enough to power about 120 average U.S. homes for a year), generating about 552 tons of carbon dioxide.
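The homes figure checks out against the average U.S. household consumption of roughly 10.7 megawatt-hours per year (an assumed figure, not stated in the article):

```python
training_mwh = 1287
avg_home_mwh_per_year = 10.7   # assumed U.S. average, not from the article
print(round(training_mwh / avg_home_mwh_per_year))  # ~120 homes for a year
```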