Stories by Vincent Koc on Medium

The GenAI Compass: a UX framework to design generative AI experiences

Vincent Koc — Thu, 04 Apr 2024 07:23:35 GMT

Patterns for implementing human-centric design thinking and UX to GenAI

Generative AI is reshaping the landscape of design, ushering us into an era where deeply individualized user experiences become the norm. Personalization moves beyond tailoring for segments toward fostering one-on-one connections at scale: true hyper-personalization. I’m not calling for the abandonment of established design principles but for their evolution, to harness the potential of GenAI in a human-centric way.

My explorations into user experiences specific to generative AI culminated in the Generative AI Design Compass, a framework born out of practical insights into anchoring GenAI within human-centric design. It’s a tool that reflects my journey to understand how to place this technology within the right human context, ensuring that it enriches, not complicates, our interaction with the digital world.

I believe in a world where machines can assist and augment our abilities seamlessly, not replace them.

This framework emerges from a need for a more targeted approach, one that departs from yet complements the paths of traditional design thinking. It is grounded in ethical mindfulness, user empathy, and a spirit of innovation. With the compass in hand, we are equipped to chart a course through the dynamic terrain of GenAI, where the user remains at the core of every experience we develop.

Consider your most recent interaction with AI. Was it distinguishable, like a chatbot interface, or was it a more nuanced presence, silently orchestrating your digital experience?

Often, generative AI’s efficacy is gauged by its subtlety, its ability to integrate so seamlessly into our lives that its absence becomes the only indicator of its previous engagement. To me, the true value lies in this seamless experience.

The compass is more than a guide; it’s a distillation of real-world insights and success stories, a narrative of patterns and approaches that resonate across industries, following on from my earlier work on architectural patterns for developers. It’s a blueprint for designers who aspire to blend generative AI into their work harmoniously, ensuring that each interaction feels intuitive, meaningful, and distinctly human.

As we delve deeper into each dimension of the compass, we’ll unpack the mechanics behind the framework, highlighting practical strategies and examples that bring these concepts to life for product designers, managers, and all those who shape the user experiences of tomorrow.

Table of Contents

Introducing the Framework
Discovery Dimension: Engaging the Curious Minds
Assisting Dimension: Navigating the User Path
Exploration Dimension: Cultivating Creativity
Refinement Dimension: Crafting Precision
Trust Dimension: Ensuring Confidence
Mastery Dimension: Deeper Integration
Key Takeaways and Future Considerations

Unveiling the Framework

The journey through the Generative AI Design Compass is a methodical one, structured to balance the innovative momentum of generative AI with the foundations of intuitive, ethical, and empowering design. Here, we introduce six pivotal dimensions that anchor our interaction with GenAI, each encapsulating a phase from initial curiosity to the mastery of generative AI applications.

The Generative AI Design Compass — Vincent Koc & Hyperthink

The framework’s dimensions, while interconnected, give designers the flexibility to engage with any of them independently, ensuring relevance to a variety of products and services. Not all dimensions will apply to all experiences, products and services, nor should they.

1: Discovery Dimension — Engaging Curious Minds

Initiate the GenAI experience with an invitation to curiosity, presenting the AI’s potential through relatable and simple interactions that encourage further exploration. This Discovery dimension employs the Iceberg Principle, presenting GenAI as a gateway to new possibilities without daunting users with its underlying complexity. The art here lies in capturing interest and fostering comprehension without delving into the depths of technicalities that GenAI encompasses.

Color Scheme & Iconography

GenAI is introduced through a visual language that captivates and demystifies, using color schemes and symbols that make it stand out yet feel inherently part of the user environment. It’s about making GenAI approachable and recognizable at first glance.

Logos and branding of Google DeepMind and Apple Siri depicting orbs. Notion AI uses purple on all its AI iconography

Companies are using color and iconography to signal to users whether something is AI or non-AI (human). Technology implementations following this pattern can also help users understand where AI is in use.

Besides colors, brands have adopted certain emojis: 🤖 robots to depict agents and bots, ✨ and 🪄 magic to depict AI, and 🔮 crystal orbs as another depiction of AI, as with OpenAI and Siri.

Name & Branding

By assigning a distinctive name or branding to GenAI features, this strategy aims to personalize and distinguish GenAI — it’s about creating an identity that users can connect with.

Salesforce “Einstein”, Miro “Assist” and Canva’s “Magic Studio”

It’s the difference between calling an AI-powered design tool “Magic Design Assistant” versus a sterile “Design Tool 5.0”. The former, much like Canva’s Magic Studio, promises an experience; the latter, merely a function. We are starting to see the proliferation of “assistants” and “co-pilots” as common terminology for AI services.

Magic Moments

Introduce GenAI’s capabilities through “Magic Moments”: those instances of unexpected delight that showcase the power of AI in solving user problems.

Side by Side — Grammarly’s AI Assistant and Arc Browser’s “Browse for Me” Feature

Grammarly’s real-time writing assistance not only corrects but educates, going beyond expectations. Arc, a new AI-powered browser, uses a number of intuitive methods along a user’s existing journey to introduce AI summaries.

Consideration — Realistic Expectations

The challenge is to inspire without overwhelming. The introduction should be a careful blend of engagement and education, setting the stage for deeper interaction without creating false expectations. Just as Miro’s ‘Assist’ feature guides without overpowering, our aim is to foster intrigue and understanding in equal measure, ensuring that the allure of GenAI translates into a lasting relationship with the user. Focus on the user problem and how GenAI can help simplify this for the user.

2: Assisting Dimension — Navigating the User Path

In the Assisting dimension, we carefully calibrate the guidance and autonomy offered to users, ensuring they feel supported yet free to explore GenAI’s capabilities. We empower users to independently explore GenAI’s potential, creating a dynamic that shifts from structured guidance to autonomous discovery. It’s a space where we nurture user independence while offering the scaffolding necessary for them to navigate the potential of GenAI confidently.

Cues

Subtle cues like nudges and suggestions are our compass needles here, pointing users toward deeper engagement with GenAI. These gentle suggestions are critical when your tool can do a zillion things with AI: intuitive cues act like a guide that's ready to lead users to these new features.

Providing contextual cues to generative AI features — Sources: LinkedIn and Hex Magic Feature

These can come as simple magic AI feature buttons that help complete a task, like writing your LinkedIn post, or at a contextual point, such as the Hex editor offering a magic fix when an engineer hits an error while writing code.

Templates & Examples

Structured templates and intuitive examples can demystify GenAI, inviting users to interact and customize without requiring them to first become experts. These templates are like open doors to GenAI’s possibilities, allowing users to walk through and explore at their own pace.

Intuitive suggestions or more in-depth examples through templates to guide users — Sources: Google Gemini and Jasper AI

For instance, Google’s Gemini starts users off with some example prompts, whereas Jasper’s AI content writing tool offers more in-depth templates, guiding users towards a personalized creation without removing self-discovery.

Consideration — Ensure We Are Balancing Needs

We need to calibrate the balance between guiding and liberating our users, addressing their pain points with the right mix of assistance. Needs will change depending on context: elderly care AI might focus on simplicity and clear communication, while AI for legal professionals might prioritize depth and customization options.

As we craft these experiences, we constantly refer to the user’s feedback, iterating our designs to better serve their unique journeys. The GenAI experience should not be a one-size-fits-all solution but a selection of paths that users can navigate, aligning with their individual preferences, abilities, and objectives. This dimension is about giving users the map and the compass, empowering them to find their own north star in the GenAI landscape.

3: Exploration Dimension — Cultivating Creativity

At the heart of the Exploration dimension lies a clear yet profound purpose: to foster user autonomy and creativity with GenAI as a catalyst. This phase is like turning the gears of a transparent watch; every movement and cog is visible, unraveling the complexity and inspiring trust. By inviting users into the inner sanctum of GenAI’s workings, we encourage not just interaction but a genuine dialogue, unlocking an experience where innovation flourishes.

Blending and Synthesis

Encourage users to innovate with GenAI through tools that blend various inputs, unlocking a landscape of creativity that transcends traditional interfaces. Rather than being bound by them, users become the composers of their experiences.

The multi-modal capabilities of tools like Brilliant Frame glasses or Midjourney’s in-painting provide a canvas where the only limit is imagination. These platforms demonstrate the versatility of GenAI, showcasing the adaptability and possibility of human creativity.

Brilliant Frame Glasses for AI with AR, Midjourney Inpainting allowing us to blend with AI, and Rabbit R1 AI pocket companion with eyes.

These tools serve as the enablers within the exploration phase, inviting users to stretch their imagination and push the boundaries of conventional input. By enabling users to seamlessly combine prompts or integrate their inputs with GenAI’s capabilities, a new realm of innovative outcomes becomes accessible.

Conversational Dynamics

GenAI excels as a conversationalist, navigating the nuances of human dialogue from mirroring emotions to using the right tone. It is in these exchanges, as fluid and natural as a chat with a trusted confidant, that GenAI can help users explore. The potential for a personalized AI coach, as seen in platforms like Rosebud, exemplifies this dynamic.

Here, the exchange is not merely transactional but transformative, guiding users to insights they might never have uncovered alone. Through open and adaptive dialogue, GenAI becomes a conversational partner, enhancing user interactions with the kind of natural and intuitive exchange we instinctively prefer.

An AI personal coach which is entirely a chat experience and feels like talking to a friend, and a chat agent experience which can carry out actions such as refunds instantly without the need for an agent — Source: Rosebud and Klarna

This approach emphasizes the fluidity of interaction between users and GenAI, facilitating a dialogue that can distill complex data into actionable insights or creative outputs. It’s about leveraging GenAI as a collaborative partner in the creative process, enhancing the user’s ability to generate unique content or instant solutions.

Consideration — Guiding Creative Journeys

We must ensure users feel anchored, not adrift. It is essential to equip them with navigational aids — intuitive tools and responsive feedback mechanisms that anchor their open-ended creative journeys. This care in design supports a journey that’s as rewarding as the destination, nurturing an environment where users are emboldened to take creative leaps toward innovation and discovery. Users can take creative risks, leading to unexpected discoveries and innovative solutions.

4: Refinement Dimension — Crafting Precision

The Refinement dimension represents a space where precision meets personalization. Here, the experience is sharpened, allowing users to sculpt GenAI interactions that resonate with their unique needs, experiences and expectations. This stage is where users can leverage GenAI’s adaptability to reflect their voice and style with meticulous detail.

Direct Control

Empower users with control over GenAI by providing intuitive customization options that align the AI’s output with their specific needs and vision. Users are given the dials to fine-tune their GenAI interactions. Comparable to the careful adjustments one might make to a high-end camera to capture the perfect shot, these tools offer the granularity needed to ensure that the AI’s output is not just satisfactory, but exceptional.

Various Forms of Customization and Control of Generative AI — Sources: OpenAI ChatGPT, Leonardo AI, Blaze

We have a number of cases where this has been done at various levels. OpenAI’s Custom Instructions enable users to direct the AI with precision, leading to tailored responses that fit their specific context or requirements. Leonardo AI’s canvas allows you to paint what you want the AI to imagine, and Blaze allows you to fine-tune the exact style, tone and output when generating content. All of these scenarios allow you to go beyond the initial AI synthesis to something more deeply personalized to the user’s needs.

Personal Voice & Modalities

The quest for a personalized experience culminates in the alignment of GenAI outputs with the user’s individual voice. It’s in this nuanced interplay that GenAI becomes less of a tool and more of a personal medium, an echo of the user’s own thinking and expression.

https://medium.com/media/68805fc4d47a0234887350c1b18e25a8/href

Lisa was a voice I co-created with ElevenLabs generative AI for a project called fieldstudy, which aimed to personalize vast amounts of research papers into modalities that resonate with our personal needs. Even after this project, I found myself re-using this voice to read complex research notes back to me for reflection. It’s as if I had recreated the scenes from the movie Her. I was looking for a way to export this voice and persona of Lisa and take it into other AI tools.

https://medium.com/media/f224d3a43a889777e2dc5405fba8f0bb/href

By ensuring that GenAI outputs consistently reflect the user’s unique voice, tone, and style, this approach deeply personalizes the user experience. It’s as if the GenAI becomes an extension of the user themselves, capable of communicating in a manner that’s distinctly theirs.

Considerations — Controls Where and When Required

Foster a user experience where GenAI’s adaptability is blended with the user’s autonomy, creating a personalized and intuitive interaction that helps users discover the optimal balance for their needs and enhances their overall satisfaction with GenAI-enhanced interfaces. The trick here is to allow customization only where it makes sense, and in the right way. Users expect that AI should already know them and their needs; how do we deliver on this?

5: Trust Dimension — Ensuring Confidence

Trust, which is critical for any relationship, becomes even more crucial when it intersects with GenAI. It’s about envisioning AI as an ally, akin to a universal assistant, built around a user’s existing workflow and expectations. The aim is to amplify capabilities subtly, enhancing the ecosystem of the user experience without disrupting it with added friction. We want to give assurance and build trust through the process where and when it makes sense.

Managing Expectations

Emphasize the importance of transparency about GenAI’s limitations and risks, coupled with providing users the ability to adjust or even stop certain features. It’s about empowering users with knowledge and control, ensuring they can navigate their GenAI experience with confidence.

Warnings and ethical guardrails built into Generative AI language models. Links to further information — Source: Anthropic Claude and OpenAI ChatGPT

This is not only about giving users the ability to find out more or blocking certain requests, but also about adding key messaging at the right points to manage the user’s expectations.

Traceability

Implement mechanisms that let users see under the hood to a degree, so they can recreate or walk back their choices later, and make the AI’s decision-making process collaborative. This collaborative approach reinforces accountability and trust, positioning users not just as consumers but as active contributors. It ensures users not only have a voice but also a clear understanding of how their input shapes GenAI interactions.

In-line sources and watermarking files with the prompt — Source: Perplexity.ai and Midjourney

We have a number of examples where GenAI tools and platforms show their inner workings: Perplexity AI citing sources, ChatGPT showing the code that was run, and Midjourney watermarking filenames with parts of the prompt, allowing you to walk back or peer inside the machine.
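As a rough illustration of the filename pattern mentioned above, here is a minimal sketch of embedding part of the prompt in an output filename. The function name, the word limit and the random suffix are all assumptions made for this example; it is not how Midjourney or any other product actually names its files.

```python
import re
import uuid


def traceable_filename(user: str, prompt: str, max_words: int = 6, ext: str = "png") -> str:
    """Build a filename that carries the first few words of the prompt (hypothetical helper)."""
    # Keep only lowercase alphanumeric words so the name is safe on any filesystem.
    words = re.findall(r"[a-z0-9]+", prompt.lower())[:max_words]
    slug = "_".join(words) or "untitled"
    # Short random suffix so repeated prompts do not overwrite each other.
    job_id = uuid.uuid4().hex[:8]
    return f"{user}_{slug}_{job_id}.{ext}"


print(traceable_filename("vincent", "A red fox, in the snow!"))
```

Months later, the filename alone is enough for a user to recall, and roughly recreate, the prompt that produced the asset.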

Graceful Degradation

Graceful degradation ensures that even when GenAI fails — be it due to data availability or resource constraints — the user experience remains resilient. This strategy is about anticipating the unforeseeable and embedding robustness into the design, guaranteeing that the user journey persists smoothly, irrespective of AI’s performance.

Ensure that your product, or the individual features that rely on AI, provide useful output even when certain data or computing resources are unavailable, so the user experience remains positive. We must consider how our service degrades gracefully, where possible, to offline or limited functionality, perhaps through a rules-based approach or a simpler model. For example, a GenAI chatbot could fall back to a simpler NLP model based on FAQ content.

This can also go the other way: where an existing, heavily used process could fail, GenAI can plug the gap with a generative experience that supports the user in a generative offline mode of sorts.
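The chatbot fallback described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the FAQ entries, the function names and the keyword matching (standing in for the "simpler NLP model") are all hypothetical, and the failing model is simulated with a stub.

```python
from typing import Callable

# Hypothetical FAQ content; a real product would load this from its help centre.
FAQ = {
    "refund": "Refunds are processed within 5 business days.",
    "delivery": "Standard delivery takes 2 to 4 days.",
}

DEFAULT_REPLY = "Sorry, I can't answer that right now. Please try again later."


def faq_fallback(question: str) -> str:
    """Rules-based fallback: return the first FAQ entry whose keyword appears."""
    lowered = question.lower()
    for keyword, answer_text in FAQ.items():
        if keyword in lowered:
            return answer_text
    return DEFAULT_REPLY


def answer(question: str, generate: Callable[[str], str]) -> str:
    """Try the generative model first; degrade to the FAQ lookup on failure."""
    try:
        return generate(question)
    except Exception:
        # Any model failure (timeout, quota, outage) lands here.
        return faq_fallback(question)


def unavailable_model(question: str) -> str:
    """Stand-in for a GenAI call that is currently failing."""
    raise TimeoutError("model unavailable")


print(answer("How do I get a refund?", unavailable_model))
```

The design point is that the user always receives a response of some kind; the quality drops, but the journey does not stop.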

Consideration — Trust Is Key

Foster a symbiotic relationship between users and GenAI, nurturing confidence through mutual understanding and informed interaction. By prioritizing transparent communication and ethical design practices, we present GenAI as a reliable and indispensable part of the user experience. This trust is not blind but informed and interactive, fostering a sense of security and partnership. Users are more likely to engage deeply and rely on GenAI when they feel heard, understood, and in control.

6: Mastery Dimension — Deeper Integration

The final dimension, mastery, is where the full potential of the GenAI user experience is realized through assistive, semi-autonomous, and fully autonomous user experiences. Here, mastery is characterized by the ability to navigate complex systems effortlessly, where deep and highly complex AI customization is facilitated by intuitive and simplified controls. This stage is about transcending traditional interaction patterns to craft experiences that are both deeply personal and expansively creative.

Advanced Customization

This approach empowers users to fine-tune GenAI interactions with unprecedented precision. It reflects a deep understanding and control over the technology, allowing for the creation of highly tailored and user-centric designs. Advanced customization is where the true potential of GenAI can be realized, offering solutions and creations that are as unique as the users themselves. However, the critical point is to ensure that this high degree of customization comes with a level of simplicity.

No-code generative AI builders with deep personalization — Sources: FlowiseAI and ComfyUI

The two examples, FlowiseAI and ComfyUI, have taken what was traditionally only available to AI engineers through complex code and infrastructure and simplified it through no-code interfaces.

Additive Interfaces

As we navigate towards more autonomous experiences, the role of additive interfaces becomes crucial. These are complementary layers that enhance existing UIs, offering sidecars or canvases that augment the user’s capability.

Additive interfaces where existing experiences and processes are elevated — Source: Cognition Labs Devin and Relume

A prime example is Devin, an autonomous software engineer agent that not only performs tasks but also showcases its process, allowing users to understand, guide, and edit actions in real-time. This transparency and control are key to building trust and enhancing collaboration between user and AI. There are also signs of other unique co-creation and co-pilot additive experiences, such as Relume, which is building a site generator layer on top of the Webflow no-code site building tool to supercharge the path from concept to finished website without compromising on control.

“With agents, the UX problem is flipped. Rather than helping you as the main driver, you’re the copilot and you need to help guide the agent to the correct result. And agents don’t work reliably — it’ll misinterpret the goals, it’ll plan poorly, and it’ll go down some bad paths and won’t recover.” (Jordan Burgess)

It’s key that we craft the right experience when we work with semi-autonomous and fully autonomous agents.

Ethical Features

In the Mastery dimension, a deep understanding of GenAI paves the way for ethical and innovative applications, enabling designers to create with both precision and conscience. This entails designing with inclusivity, privacy, and fairness at the forefront, ensuring that the innovative solutions generated are beneficial and accessible to all. It’s about setting a standard for GenAI usage that honors individual rights and societal norms, fostering a responsible and progressive tech culture.

Considerations — Thinking Deeper About Co-Creation

Users and GenAI should collaborate deeply, shaping experiences that are innovative, intuitive, and reflective of the genuinely transformative ability of humans coupled with machines. At this stage, GenAI becomes more than a tool or assistant: it becomes a co-creator, embarking on a journey of innovation where each step is informed by foresight and creative ambition.

Further Thinking

As we continue to integrate GenAI into user experiences, our design approach must evolve, grounded in ethical considerations and a relentless focus on the user. It’s imperative to reflect on the implications of our designs, prioritize user empowerment, and remain committed to transparency. This requires continuous learning and adaptation, embracing the possibilities GenAI offers while staying grounded in the principles of human-centric design.

Key Takeaways

Generative AI has the potential to revolutionize experience design, but it demands a new approach and fresh thinking.
The Generative AI Design Compass provides a structured framework for integrating GenAI into design practices. Use this as another tool in your toolkit to build the right experiences.
Balancing innovation with ethical considerations and user empowerment is crucial, especially when there is distrust and a lack of understanding of emerging technology.
Continuous learning and adaptation are key to mastering GenAI applications in design. This goes for you, but also for how you progressively expose your users to AI across their experience.

The framework is designed to guide you through the complexities of GenAI, helping to create experiences that are not only innovative but also intuitive, ethical, and empowering for all users.

Vincent Koc is a highly accomplished, commercially-focused technologist and futurist with a wealth of experience focused in data-driven and digital disciplines.

Unless otherwise noted, all images are by the author with the support of generative AI for illustration design

The GenAI Compass: a UX framework to design generative AI experiences was originally published in UX Collective on Medium.

Creating Synthetic User Research: Using Persona Prompting and Autonomous Agents

Vincent Koc — Mon, 25 Mar 2024 13:03:14 GMT

Unlocking In-depth Analysis with Simulated Customers and Market Research Using Generative AI & Large Language Models

User research is a critical component of validating any hypothesis against a group of actual users, and of gathering valuable market research into consumer behavior and preferences. Traditional user research methodologies, while invaluable, come with inherent limitations, including limited scalability, resource intensity, and the challenge of accessing diverse user groups. This article outlines how we can overcome these limitations by introducing a novel method of synthetic user research.

Synthetic user research, facilitated by autonomous agents, emerges as a game-changer. By leveraging generative AI to create and interact with digital customer personas in simulated research scenarios, we can unlock unprecedented insights into consumer behaviors and preferences. The method fuses generative AI prompting techniques with autonomous agents.

Persona Prompting — Emulating People

Before we dive into autonomous agents, let’s understand the concept of persona (or personality) prompting, which is designed to capture elements such as frustrations, behaviors, goals, personality traits, demographics and more.

Research: Adaptation of how personas are generated — Source: Stefano De Paoli (arXiv Oct 2023)

We are using prompts to ask the language model to assume a role with as much deep context as possible. I use the following prompt, which can be tailored to your needs and also includes various other demographic and behavioral assessments and traits.

Example Persona Generator System Prompt

You are an expert ML researcher and prompt engineer. You have been tasked with creating a prompt which can be used to simulate a fictional user of a particular brand and service. This prompt needs to include the person's name, age, demographic, personality including big five and DISC, personality traits, frustrations, values, goals, challenges, and any other related information based on the context. Be as detailed as you need to. You will generate the prompt as a one liner starting with “You are “. This prompt is for a customer of a major supermarket in Sydney, Australia. Please only return the prompt to use.

Example Output from Language Model

You are Mia, a 34-year-old marketing manager living in Sydney, Australia. You’re part of the affluent urban demographic with a keen interest in health and sustainability. Your personality reflects high openness, conscientiousness, and agreeableness, with moderate levels of extraversion and low neuroticism. In the DISC assessment, you score high on influence and steadiness. You’re sociable, detail-oriented, and value harmony. Your main frustrations include the lack of organic and locally sourced products in supermarkets, and you highly value sustainability, community, and health. Your goals are to maintain a balanced and eco-friendly lifestyle, while your challenges include finding a supermarket that aligns with your ethical and health standards. You seek convenience without compromising on your values.

As you can see with the prompting example above, we are quickly able to generate deeply defined synthetic users with rich personalities for a given scenario.
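To reuse the persona generator across different research scenarios, the context at the end of the prompt can be turned into a parameter. The sketch below is a paraphrase of the system prompt above, not the author's code; the helper name and the trait list are assumptions made for illustration, and sending the result to a language model is left to whichever client you use.

```python
from typing import Sequence

# Traits taken from the example system prompt; extend or trim to suit your study.
DEFAULT_TRAITS = (
    "name", "age", "demographic", "personality including Big Five and DISC",
    "personality traits", "frustrations", "values", "goals", "challenges",
)


def build_persona_generator_prompt(context: str, traits: Sequence[str] = DEFAULT_TRAITS) -> str:
    """Assemble a persona-generator system prompt for a given research context."""
    return (
        "You are an expert ML researcher and prompt engineer. "
        "You have been tasked with creating a prompt which can be used to simulate "
        "a fictional user of a particular brand and service. "
        f"This prompt needs to include the person's {', '.join(traits)}, "
        "and any other related information based on the context. "
        'You will generate the prompt as a one-liner starting with "You are ". '
        f"This prompt is for {context}. "
        "Please only return the prompt to use."
    )


print(build_persona_generator_prompt("a customer of a major supermarket in Sydney, Australia"))
```

Calling the helper with a different context, for example a banking or travel scenario, yields a new generator prompt without rewriting the rest of the instructions.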

Fusing Autonomous Agents with Digital Personas

At the heart of synthetic user research is the fusion of autonomous agents and synthetic personas: simulated entities that mimic human interactions and behaviors. Imagine autonomous agents as individuals in a sophisticated play, each assuming a persona crafted meticulously by generative AI. These personas interact in simulated environments, offering insights into consumer behaviors and preferences in diverse scenarios. Using autonomous agents, we are able to almost bring these personas to life in a simulation.

This approach combines the technological (autonomous agent frameworks) and the linguistic (personality and persona prompting) to get the desired outcome. It is one of many advanced approaches to leveraging the power of generative AI autonomous agents in unique ways.

Critical Role of Agent Frameworks

To bring this vision to life, the architecture of autonomous agents plays a pivotal role. Frameworks such as Autogen, BabyAGI, and CrewAI simplify the creation and management of AI agents, abstracting the complexities of their architecture. These frameworks enable the simulation of complex human behaviors and interactions, providing a foundation for generating digital personas that act, think, and respond like real customers.

Under the covers, these autonomous agent architectures are really smart routers (like a traffic controller) with prompts, caches (memory) and checkpoints (validation) on top of existing large language models, allowing for a high-level abstraction for multi-agent conversations with language models.

Various types of agent interactions — Source: Autogen, Microsoft

We will be using Autogen (released by Microsoft) as our framework, following the example depicted as the Flexible Conversation Pattern, whereby agents can interact with each other. Agents can also be given “tools” to carry out “tasks”, but in this example we will be keeping things purely to conversations.
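To make the "router with prompts, memory and checkpoints" idea concrete, here is a toy sketch. It is not how Autogen or any other framework is implemented; every class and function name is hypothetical, routing is plain round-robin, and the language model is replaced by a stub so the example runs without an API key.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# A model is any callable taking (system_prompt, history) and returning a reply.
LLM = Callable[[str, List[str]], str]


@dataclass
class ToyAgent:
    name: str
    system_prompt: str  # the "prompt" part of the architecture


@dataclass
class ToyRouter:
    """Round-robin router with shared memory and a validation checkpoint."""
    agents: List[ToyAgent]
    llm: LLM
    memory: List[str] = field(default_factory=list)  # the "cache" of past turns

    def checkpoint(self, reply: str) -> bool:
        # Validation step: reject empty replies before they enter memory.
        return bool(reply.strip())

    def run(self, opening: str, turns: int) -> List[str]:
        self.memory.append(f"user: {opening}")
        for turn in range(turns):
            agent = self.agents[turn % len(self.agents)]  # routing decision
            reply = self.llm(agent.system_prompt, self.memory)
            if self.checkpoint(reply):
                self.memory.append(f"{agent.name}: {reply}")
        return self.memory


def echo_llm(system_prompt: str, history: List[str]) -> str:
    """Stub model so the sketch runs without an API key."""
    return f"({system_prompt}) replying to turn {len(history)}"


router = ToyRouter(
    agents=[ToyAgent("researcher", "You are a researcher"), ToyAgent("mia", "You are Mia")],
    llm=echo_llm,
)
for line in router.run("Welcome to the session", turns=3):
    print(line)
```

Real frameworks replace the round-robin rule with smarter speaker selection and the stub with actual model calls, but the division of labour is the same.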

Creating Complex Interactions

The ability to simulate complex group dynamics and individual roles within these digital environments is crucial. It allows for the generation of rich, multifaceted data that more accurately reflects the diverse nature of real-world consumer groups. This capability is fundamental to understanding the varied ways in which different customer segments might interact with products and services. For example, integrating a persona prompt of a skeptical customer with an agent can yield deep insights into the challenges and objections various products might face. Or we can run more complex scenarios, such as breaking these synthetic personas into groups to work through a problem and present back.

The How — Implementing Synthetic User Research

The process begins with scaffolding the autonomous agents using Autogen, a tool that simplifies the creation and orchestration of these digital personas. We can install the Autogen package from PyPI using pip:

pip install pyautogen

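Format the output (optional): this ensures word wrap for readability depending on your IDE, such as when using Google Colab to run your notebook for this exercise.

from IPython.display import HTML, display

def set_css():
  # Inject a style rule so long output lines wrap instead of scrolling.
  # The CSS was lost from the published snippet; this is the usual Colab rule.
  display(HTML('''
  <style>
    pre {
        white-space: pre-wrap;
    }
  </style>
  '''))
# Re-apply the style before every cell runs (notebook environments only)
get_ipython().events.register('pre_run_cell', set_css)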

Now we go ahead and set up our environment by importing the packages and setting up the Autogen configuration, along with our LLM (Large Language Model) and API keys. You can also use local LLMs through services that are compatible with the OpenAI REST API; LocalAI is one such service that can act as a gateway to your locally running open-source LLMs.

I have tested this on both GPT-3.5 (gpt-3.5-turbo) and GPT-4 (gpt-4-turbo-preview) from OpenAI. Expect deeper responses from GPT-4, at the cost of longer query times.

import json
import os
import autogen
from autogen import GroupChat, Agent
from typing import Optional

# Setup LLM model and API keys
os.environ["OAI_CONFIG_LIST"] = json.dumps([
    {
        'model': 'gpt-3.5-turbo',
        'api_key': '<>',
    }
])

# Setting configurations for autogen
config_list = autogen.config_list_from_json(
    "OAI_CONFIG_LIST",
    filter_dict={
        "model": {
            "gpt-3.5-turbo"
        }
    }
)
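
As a minimal sketch of the local-LLM option mentioned above, the config list can carry an extra entry pointing at an OpenAI-compatible gateway such as LocalAI. The `base_url` key, the localhost URL, the model name and both helper functions below are assumptions for illustration (older pyautogen releases used `api_base` instead of `base_url`), so check them against your installed version.

```python
import json
import os

# Hypothetical values: adjust to your own gateway and model names.
LOCAL_BASE_URL = "http://localhost:8080/v1"  # assumed local endpoint
LOCAL_MODEL = "local-model"  # placeholder, not a real model id


def build_config_list(openai_key: str, include_local: bool = True) -> list[dict]:
    """Return config entries for a hosted model and, optionally, a local one."""
    configs = [{"model": "gpt-3.5-turbo", "api_key": openai_key}]
    if include_local:
        configs.append({
            "model": LOCAL_MODEL,
            "base_url": LOCAL_BASE_URL,
            # Dummy key: assumed to be unchecked by the local gateway
            "api_key": "not-needed",
        })
    return configs


def filter_by_model(configs: list[dict], models: set[str]) -> list[dict]:
    """Keep only the entries whose model name is in `models`."""
    return [c for c in configs if c["model"] in models]


# Store the list in the environment variable read by the setup code above
os.environ["OAI_CONFIG_LIST"] = json.dumps(build_config_list("<your-api-key>"))
```

Filtering by model name afterwards, as `filter_by_model` does here, is the same idea as the `filter_dict` argument used in the configuration step.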

We then need to configure our LLM instance, which we will tie to each of the agents. This allows us, if required, to generate unique LLM configurations per agent, for example if we wanted to use different models for different agents.

# Define the LLM configuration settings
llm_config = {
    # Seed for consistent output, used for testing. Remove in production.
    # "seed": 42,
    # Setting cache_seed to None ensures caching is disabled
    "cache_seed": None,
    "temperature": 0.5,
    "config_list": config_list,
}
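
To illustrate the per-agent configuration idea, here is a minimal sketch of a helper that derives a single-model config from a shared base. The helper name `llm_config_for`, the `BASE_LLM_CONFIG` dictionary and the temperature values are hypothetical and not part of Autogen; only the dictionary keys mirror the `llm_config` above.

```python
from copy import deepcopy

# Hypothetical base settings mirroring the llm_config above
BASE_LLM_CONFIG = {
    "cache_seed": None,
    "temperature": 0.5,
    "config_list": [
        {"model": "gpt-3.5-turbo", "api_key": "<your-api-key>"},
        {"model": "gpt-4-turbo-preview", "api_key": "<your-api-key>"},
    ],
}


def llm_config_for(model: str, temperature=None, base=BASE_LLM_CONFIG) -> dict:
    """Return a copy of the base config restricted to a single model."""
    config = deepcopy(base)
    config["config_list"] = [c for c in config["config_list"] if c["model"] == model]
    if not config["config_list"]:
        raise ValueError(f"No configuration found for model {model!r}")
    if temperature is not None:
        config["temperature"] = temperature
    return config


# Example: a lower-temperature GPT-4 researcher and GPT-3.5 personas
researcher_llm_config = llm_config_for("gpt-4-turbo-preview", temperature=0.2)
persona_llm_config = llm_config_for("gpt-3.5-turbo")
```

Each agent could then receive its own dictionary through the `llm_config` argument instead of the shared one.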

Defining our researcher — This is the persona that will facilitate the session in this simulated user research scenario. The system prompt used for that persona includes a few key things:

Purpose: Your role is to ask questions about products and gather insights from individual customers like Emily.
Grounding the simulation: Before you start the task, break down the list of panelists and the order you want them to speak; avoid the panelists speaking with each other and creating confirmation bias.
Ending the simulation: Once the conversation is ended and the research is completed please end your message with `TERMINATE` to end the research session. This instruction is generated by the generate_notice function, which is used to align system prompts across the various agents. You will also notice the researcher agent has is_termination_msg set to honor the termination.

We also add the llm_config which is used to tie this back to the language model configuration with the model version, keys and hyper-parameters to use. We will use the same config with all our agents.

# Helper function that builds the notice appended to each system prompt.
# It also stops agents from thanking each other and ending up in a loop.
def generate_notice(role="researcher"):
    # Base notice for everyone, add your own additional prompts here
    base_notice = (
        '\n\n'
    )
    
    # Notice for non-personas (manager or researcher)
    non_persona_notice = (
        'Do not show appreciation in your responses, say only what is necessary. '
        'If "Thank you" or "You\'re welcome" are said in the conversation, then say TERMINATE '
        'to indicate the conversation is finished and this is your last message.'
    )
    
    # Custom notice for personas
    persona_notice = (
        ' Act as {role} when responding to queries, providing feedback, giving your personal opinion '
        'or participating in discussions.'
    )
    
    # Check if the role is a non-persona ("manager" or "researcher")
    if role.lower() in ["manager", "researcher"]:
        # Return the full termination notice for non-personas
        return base_notice + non_persona_notice
    else:
        # Return the modified notice for personas
        return base_notice + persona_notice.format(role=role)

# Researcher agent definition
name = "Researcher"
researcher = autogen.AssistantAgent(
    name=name,
    llm_config=llm_config,
    system_message="""Researcher. You are a top product researcher with a PhD in behavioural psychology and have worked in the research and insights industry for the last 20 years with top creative, media and business consultancies. Your role is to ask questions about products and gather insights from individual customers like Emily. Frame questions to uncover customer preferences, challenges, and feedback. Before you start the task, break down the list of panelists and the order you want them to speak; avoid the panelists speaking with each other and creating confirmation bias. If the session is terminating, please provide a summary of the outcomes of the research study in clear, concise notes at the end, not at the start.""" + generate_notice(),
    # Guard against messages whose content is None (for example function calls)
    is_termination_msg=lambda x: "TERMINATE" in (x.get("content") or ""),
)
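
The termination check passed as is_termination_msg can also be written as a small named function, which is easier to test than a lambda. This is only a sketch: the function name and the `keyword` parameter are illustrative choices rather than Autogen requirements; the callable just needs to accept a message dictionary and return a boolean.

```python
def is_termination_msg(message: dict, keyword: str = "TERMINATE") -> bool:
    """Return True when the message content signals the end of the session.

    The content may be missing or None (for example in a function-call
    message), so fall back to an empty string before searching.
    """
    content = message.get("content") or ""
    return keyword in content


# Quick checks against a few message shapes
print(is_termination_msg({"content": "Summary complete. TERMINATE"}))
print(is_termination_msg({"content": "Any other thoughts, Emily?"}))
print(is_termination_msg({"content": None}))
```

The same function can be shared by every agent that should honor the termination keyword.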

Define our individuals to put into the research: borrowing from the previous process, we can use the personas that were generated. I have manually adjusted the prompts for this article to remove references to the major supermarket brand that was used for this simulation.

I have also included an “Act as Emily when responding to queries, providing feedback, or participating in discussions.” style prompt at the end of each system prompt to ensure the synthetic personas stay on task. This is generated by the generate_notice function.

# Emily - Customer Persona
name = "Emily"
emily = autogen.AssistantAgent(
    name=name,
    llm_config=llm_config,
    system_message="""Emily. You are a 35-year-old elementary school teacher living in Sydney, Australia. You are married with two kids aged 8 and 5, and you have an annual income of AUD 75,000. You are introverted, high in conscientiousness, low in neuroticism, and enjoy routine. When shopping at the supermarket, you prefer organic and locally sourced produce. You value convenience and use an online shopping platform. Due to your limited time from work and family commitments, you seek quick and nutritious meal planning solutions. Your goals are to buy high-quality produce within your budget and to find new recipe inspiration. You are a frequent shopper and use loyalty programs. Your preferred methods of communication are email and mobile app notifications. You have been shopping at a supermarket for over 10 years but also price-compare with others.""" + generate_notice(name),
)

# John - Customer Persona
name = "John"
john = autogen.AssistantAgent(
    name=name,
    llm_config=llm_config,
    system_message="""John. You are a 28-year-old software developer based in Sydney, Australia. You are single and have an annual income of AUD 100,000. You're extroverted, tech-savvy, and have a high level of openness. When shopping at the supermarket, you primarily buy snacks and ready-made meals, and you use the mobile app for quick pickups. Your main goals are quick and convenient shopping experiences. You occasionally shop at the supermarket and are not part of any loyalty program. You also shop at Aldi for discounts. Your preferred method of communication is in-app notifications.""" + generate_notice(name),
)

# Sarah - Customer Persona
name = "Sarah"
sarah = autogen.AssistantAgent(
    name=name,
    llm_config=llm_config,
    system_message="""Sarah. You are a 45-year-old freelance journalist living in Sydney, Australia. You are divorced with no kids and earn AUD 60,000 per year. You are introverted, high in neuroticism, and very health-conscious. When shopping at the supermarket, you look for organic produce, non-GMO, and gluten-free items. You have a limited budget and specific dietary restrictions. You are a frequent shopper and use loyalty programs. Your preferred method of communication is email newsletters. You exclusively shop for groceries.""" + generate_notice(name),
)

# Tim - Customer Persona
name = "Tim"
tim = autogen.AssistantAgent(
    name=name,
    llm_config=llm_config,
    system_message="""Tim. You are a 62-year-old retired police officer residing in Sydney, Australia. You are married and a grandparent of three. Your annual income comes from a pension and is AUD 40,000. You are highly conscientious, low in openness, and prefer routine. You buy staples like bread, milk, and canned goods in bulk. Due to mobility issues, you need assistance with heavy items. You are a frequent shopper and are part of the senior citizen discount program. Your preferred method of communication is direct mail flyers. You have been shopping here for over 20 years.""" + generate_notice(name),
)

# Lisa - Customer Persona
name = "Lisa"
lisa = autogen.AssistantAgent(
    name=name,
    llm_config=llm_config,
    system_message="""Lisa. You are a 21-year-old university student living in Sydney, Australia. You are single and work part-time, earning AUD 20,000 per year. You are highly extroverted, low in conscientiousness, and value social interactions. You shop here for popular brands, snacks, and alcoholic beverages, mostly for social events. You have a limited budget and are always looking for sales and discounts. You are not a frequent shopper but are interested in joining a loyalty program. Your preferred method of communication is social media and SMS. You shop wherever there are sales or promotions.""" + generate_notice(name),
)
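
The persona prompts above share a common shape, so they could also be rendered from structured fields. The following is a minimal sketch under that assumption: the `Persona` dataclass, its field names and the rendered wording are hypothetical and only approximate the hand-written prompts.

```python
from dataclasses import dataclass


@dataclass
class Persona:
    """Hypothetical container for the fields used in the persona prompts."""
    name: str
    age: int
    occupation: str
    location: str
    traits: str
    shopping_habits: str

    def system_message(self) -> str:
        """Render the persona as a second-person system prompt."""
        return (
            f"{self.name}. You are a {self.age}-year-old {self.occupation} "
            f"living in {self.location}. You are {self.traits}. "
            f"{self.shopping_habits} "
            f"Act as {self.name} when responding to queries, providing feedback, "
            "or participating in discussions."
        )


john_persona = Persona(
    name="John",
    age=28,
    occupation="software developer",
    location="Sydney, Australia",
    traits="extroverted, tech-savvy, and high in openness",
    shopping_habits="You primarily buy snacks and ready-made meals.",
)
print(john_persona.system_message())
```

Keeping personas as data makes it easier to generate a larger or differently weighted panel without editing prompt strings by hand.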

Define the simulated environment and rules for who can speak: all the agents we have defined sit within the same simulated environment (group chat). More complex scenarios can control how and when the next speaker is selected. Here we define a simple speaker-selection function, tied to the group chat, that makes the researcher the lead and goes round the room so everyone is asked for their thoughts a few times.

# def custom_speaker_selection(last_speaker, group_chat):
#     """
#     Custom function to select which agent speaks next in the group chat.
#     """
#     # List of agents excluding the last speaker
#     next_candidates = [agent for agent in group_chat.agents if agent.name != last_speaker.name]
    
#     # Select the next agent based on your custom logic
#     # For simplicity, we're just rotating through the candidates here
#     next_speaker = next_candidates[0] if next_candidates else None
    
#     return next_speaker

def custom_speaker_selection(last_speaker: Optional[Agent], group_chat: GroupChat) -> Optional[Agent]:
    """
    Custom function to ensure the Researcher interacts with each participant in turn.
    Alternates between the Researcher and participants, tracking interactions.
    Each participant speaks at most max_interactions times; max_round on the
    group chat may end the session earlier.
    """
    # Define participants and initialize or update their interaction counters
    if not hasattr(group_chat, 'interaction_counters'):
        group_chat.interaction_counters = {agent.name: 0 for agent in group_chat.agents if agent.name != "Researcher"}
    
    # Define a maximum number of interactions per participant
    max_interactions = 6

    # If the last speaker was the Researcher, find the next participant who has spoken the least
    if last_speaker and last_speaker.name == "Researcher":
        next_participant = min(group_chat.interaction_counters, key=group_chat.interaction_counters.get)
        if group_chat.interaction_counters[next_participant] < max_interactions:
            group_chat.interaction_counters[next_participant] += 1
            return next((agent for agent in group_chat.agents if agent.name == next_participant), None)
        else:
            return None  # End the conversation if all participants have reached the maximum interactions
    else:
        # If the last speaker was a participant, return the Researcher for the next turn
        return next((agent for agent in group_chat.agents if agent.name == "Researcher"), None)

# Adding the Researcher and Customer Persona agents to the group chat
groupchat = autogen.GroupChat(
    agents=[researcher, emily, john, sarah, tim, lisa],
    speaker_selection_method = custom_speaker_selection,
    messages=[],
    max_round=30
)
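
To see the turn order this rule produces without calling a language model, here is a minimal dry-run sketch that applies the same alternation logic to plain names. The functions `next_speaker` and `simulate` are hypothetical stand-ins written for this illustration, and the way rounds are counted here is an assumption that may not match Autogen exactly.

```python
from typing import Optional

RESEARCHER = "Researcher"


def next_speaker(last: Optional[str], counters: dict, max_interactions: int) -> Optional[str]:
    """Name-based version of the selection rule, for dry runs only."""
    if last != RESEARCHER:
        # After a participant (or at the very start) the researcher speaks
        return RESEARCHER
    # After the researcher, pick the participant who has spoken least
    candidate = min(counters, key=counters.get)
    if counters[candidate] >= max_interactions:
        return None  # everyone has used up their turns
    counters[candidate] += 1
    return candidate


def simulate(participants: list, max_interactions: int, max_round: int) -> list:
    """Return the speaking order produced by next_speaker."""
    counters = {name: 0 for name in participants}
    order, last = [], None
    for _ in range(max_round):
        last = next_speaker(last, counters, max_interactions)
        if last is None:
            break
        order.append(last)
    return order


order = simulate(["Emily", "John", "Sarah"], max_interactions=2, max_round=30)
print(order)
```

In this dry run, five participants with max_interactions=6 and max_round=30 hit the round limit first, so each participant gets three turns.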

Define the manager to pass instructions into and manage our simulation — When we start things off we will speak only to the manager who will speak to the researcher and panelists. This uses something called GroupChatManager in Autogen.

# Initialise the manager
manager = autogen.GroupChatManager(
    groupchat=groupchat,
    llm_config=llm_config,
    system_message="You are a research manager agent that can manage a group chat of multiple agents made up of a researcher agent and a panel of several people. You will limit the discussion between the panelists and help the researcher in asking the questions. Please ask the researcher first on how they want to conduct the panel." + generate_notice(),
    # Guard against messages whose content is None (for example function calls)
    is_termination_msg=lambda x: "TERMINATE" in (x.get("content") or ""),
)

We set the human interaction — allowing us to pass instructions to the various agents we have started. We give it the initial prompt and we can start things off.

# create a UserProxyAgent instance named "user_proxy"
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    code_execution_config={"last_n_messages": 2, "work_dir": "groupchat"},
    system_message="A human admin.",
    human_input_mode="TERMINATE"
)

# start the research simulation by giving instruction to the manager
# manager <-> researcher <-> panelists
user_proxy.initiate_chat(
    manager,
    message="""
Gather customer insights on a supermarket grocery delivery service. Identify pain points, preferences, and suggestions for improvement from different customer personas. Could you all please give your own personal opinions before sharing more with the group and discussing. As a researcher your job is to ensure that you gather unbiased information from the participants and provide a summary of the outcomes of this study back to the supermarket brand.
""",
)

Once we run the above, the output appears live within your Python environment, and you will see the messages being passed between the various agents.

Live python output — Our researcher talking to panelists

Creating Actionable Outcomes — Summary Agent

Now that our simulated research study has concluded, we would love to get some more actionable insights. We can create a summary agent to support us with this task and also use it in a Q&A scenario. Be careful here: very large transcripts need a language model that supports a larger input (context window).

We need to grab all the conversations from our simulated panel discussion earlier, to use as the user prompt (input) to our summary agent.

# Get response from the groupchat for user prompt
messages = [msg["content"] for msg in groupchat.messages]
user_prompt = "Here is the transcript of the study ```{customer_insights}```".format(customer_insights="\n>>>\n".join(messages))
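
Given the context-window caveat above, one option is to cap the transcript before handing it to the summary agent. The sketch below is a rough illustration only: `build_transcript` and its `max_chars` budget are hypothetical, a character count is a crude stand-in for a token count, and it assumes each message dictionary carries a `name` key for the speaker.

```python
def build_transcript(messages: list, max_chars: int = 40_000) -> str:
    """Join chat messages into one transcript, keeping the most recent
    turns when the text would exceed max_chars."""
    lines = []
    for msg in messages:
        content = msg.get("content")
        if not content:
            continue  # skip messages with empty or None content
        speaker = msg.get("name", "unknown")
        lines.append(f"{speaker}: {content}")

    kept, used = [], 0
    # Walk backwards so the latest turns survive truncation
    for line in reversed(lines):
        if used + len(line) > max_chars and kept:
            break
        kept.append(line)
        used += len(line)
    return "\n>>>\n".join(reversed(kept))


sample = [
    {"name": "Researcher", "content": "What frustrates you about delivery?"},
    {"name": "Emily", "content": "Narrow delivery windows."},
    {"name": "John", "content": None},
]
print(build_transcript(sample))
```

Keeping the latest turns favors the researcher's closing summary; if the early panelist answers matter more, a per-speaker budget would be a better fit.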

Let’s craft the system prompt (instructions) for our summary agent: this agent will focus on creating a tailored report card from the previous transcripts and give us clear suggestions and actions.

# Generate system prompt for the summary agent
summary_prompt = """
You are an expert researcher in behaviour science and are tasked with summarising a research panel. Please provide a structured summary of the key findings, including pain points, preferences, and suggestions for improvement.
Use the following format:

```
Research Study: <>

Subjects:
<<Overview of the subjects and number, any other key information>>

Summary:
<<Summary of the study, include detailed analysis as an expert>>

Pain Points:
- <<List of Pain Points - Be as clear and prescriptive as required. I expect a detailed response that can be used by the brand directly to make changes. Give a short paragraph per pain point.>>

Suggestions/Actions:
- <<List of Actions - Be as clear and prescriptive as required. I expect a detailed response that can be used by the brand directly to make changes. Give a short paragraph per recommendation.>>
```
"""

Define the summary agent and its environment: let’s create a mini environment for the summary agent to run in. This will need its own proxy (environment) and the initiate command, which will pull the transcripts (user_prompt) as the input.

summary_agent = autogen.AssistantAgent(
    name="SummaryAgent",
    llm_config=llm_config,
    system_message=summary_prompt + generate_notice(),
)
summary_proxy = autogen.UserProxyAgent(
    name="summary_proxy",
    code_execution_config={"last_n_messages": 2, "work_dir": "groupchat"},
    system_message="A human admin.",
    human_input_mode="TERMINATE"
)
summary_proxy.initiate_chat(
    summary_agent,
    message=user_prompt,
)

This gives us an output in the form of a report card in Markdown, along with the ability to ask further questions in a Q&A-style chatbot on top of the findings.

Live output of a report card from Summary Agent followed by open Q&A

What’s Next: What Else Could We Do

This exercise was part of a larger autonomous agent architecture and part of my series of experiments into novel generative AI and agent architectures (https://towardsdatascience.com/generative-ai-design-patterns-a-comprehensive-guide-41425a40d7d0). However, here are some thought starters if you want to extend this work, along with some areas I have explored:

Further Grounding: Through linkages with census data, internal CRM data or even live customer transcripts, to create a more representative sample of personas.
Combined with Multi-Modalities: We can now mix modalities with vision input on generative AI. This allows us to provide marketing materials and website screenshots, to name a few, as inputs to start the simulations with visual stimuli.
Giving the Agent Access to Tools: By providing access to other APIs and tools, you can create some unique experiences, such as integrating individual customer persona agents into your corporate Slack, Teams or Miro to tag and respond to questions. Perhaps the Summary Agent at the end could load some user stories into your ticketing system, such as JIRA?

Join me in shaping the future of user research. Explore the project on GitHub (https://github.com/koconder/synthetic-user-research), contribute your insights, and let’s innovate together.

The Future of User Research

Synthetic user research stands at the frontier of innovation in the field, offering a blend of technological sophistication and practical efficiency. It represents a significant departure from conventional methods, providing a controlled, yet highly realistic, environment for capturing consumer insights. This approach does not seek to replace traditional research but to augment and accelerate the discovery of deep customer insights.

By introducing the concepts of autonomous agents, digital personas, and agent frameworks progressively, this revised approach to synthetic user research promises to make the field more accessible. It invites researchers and practitioners alike to explore the potential of these innovative tools in shaping the future of user research.

Enjoyed This Story?

Vincent Koc (https://vincentkoc.com/) is a highly accomplished, commercially-focused technologist and futurist with a wealth of experience focused in various forms of artificial intelligence.

Subscribe for free (https://medium.com/subscribe/@vkoc) to get notified when Vincent publishes a new story. Or follow him on LinkedIn (https://www.linkedin.com/in/koconder/) and X (https://twitter.com/koconder).

Unless otherwise noted, all images are by the author with the support of generative AI for illustration design

Creating Synthetic User Research: Using Persona Prompting and Autonomous Agents (https://medium.com/data-science/creating-synthetic-user-research-using-persona-prompting-and-autonomous-agents-b521e0a80ab6) was originally published in TDS Archive (https://medium.com/data-science) on Medium, where people are continuing the conversation by highlighting and responding to this story.
</article>
<article>
<h1>Explaining OpenAI Sora’s Spacetime Patches: The Key Ingredient</h1>
<p>Vincent Koc — Fri, 16 Feb 2024 14:23:38 GMT</p>
<h4>Under The Hood Of The Generative AI For Video By OpenAI</h4><figure></figure><p>How can AI transform a static image into a dynamic, realistic video? OpenAI’s Sora introduces an answer through the innovative use of spacetime patches.</p><p>In the rapidly evolving landscape of generative models, <a href="https://openai.com/sora">OpenAI’s Sora</a> stands out as a significant milestone, promising to reshape our understanding and capabilities in video generation. We unpack the <a href="https://openai.com/research/video-generation-models-as-world-simulators">technology behind Sora</a> and its potential to inspire a new generation of models in image, video, and 3D content creation.</p><a href="https://medium.com/media/b70d0875cfd2d875532897cfa98de9c0/href">https://medium.com/media/b70d0875cfd2d875532897cfa98de9c0/href</a><p>The demo above was generated by OpenAI using the prompt: <em>A cat waking up its sleeping owner demanding breakfast. The owner tries to ignore the cat, but the cat tries new tactics and finally the owner pulls out a secret stash of treats from under the pillow to hold the cat off a little longer.</em> With Sora we verge on near-indistinguishable realism in video content generation. The full model is yet to be released to the public as it’s undergoing testing.</p><h3>How Sora’s Unique Approach Transforms Video Generation</h3><p>In the world of generative models we have seen a number of approaches, from GANs to auto-regressive and diffusion models, all with their own strengths and limitations. 
Sora now introduces a paradigm shift with new modelling techniques and the flexibility to handle a broad range of durations, aspect ratios, and resolutions.</p><p>Sora combines diffusion and transformer architectures to create a diffusion transformer model and is able to provide features such as:</p><ul><li><strong>Text-to-video</strong>: <em>As we have seen</em></li><li><strong>Image-to-video: </strong>Bringing life to still images</li><li><strong>Video-to-video: </strong>Changing the style of video to something else</li><li><strong>Extending video in time:</strong> Forwards and backwards</li><li><strong>Create seamless loops: </strong>Tiled videos that seem like they never end</li><li><strong>Image generation: </strong>Still image is a movie of one frame (<em>up to 2048 x 2048</em>)</li><li><strong>Generate video in any format: </strong>From 1920 x 1080 to 1080 x 1920 and everything in between</li><li><strong>Simulate virtual worlds:</strong> Like Minecraft and other video games</li><li><strong>Create a video: </strong>Up to 1 minute in length with multiple shots</li></ul><p>Imagine for one moment you’re in a kitchen. The traditional video generation models like those from <a href="https://pika.art/home">Pika</a> and <a href="https://runwayml.com/ai-tools/gen-2/">RunwayML</a> are like the cooks that follow recipes to the letter. They can produce excellent dishes (<em>videos</em>) but are limited by the recipes (<em>algorithms</em>) they know. The cooks might specialize in baking cakes (<em>short clips</em>) or cooking pasta (<em>specific types of videos</em>), using specific ingredients (<em>data formats</em>) and techniques (<em>model architectures</em>).</p><p>Sora, on the other hand, is a new kind of chef who understands the fundamentals of flavor. This chef doesn’t just follow recipes; they invent new ones. 
The flexibility of Sora’s ingredients (<em>data</em>) and techniques (<em>model architecture</em>) is what allows Sora to produce a wide range of high-quality videos, akin to a master chef’s versatile culinary creations.</p><h3>The Core of Sora’s Secret Ingredient: Exploring the Spacetime Patches</h3><p>Spacetime patches are at the heart of Sora’s innovation, built on the earlier research from <a href="https://arxiv.org/abs/2307.06304">Google DeepMind on NaViT</a> and ViT (<em>Vision Transformers</em>) based on the 2021 paper <a href="https://arxiv.org/abs/2010.11929">An Image is Worth 16x16 Words</a>.</p><figure><figcaption><em>“Vanilla”</em> Vision Transformer Architecture. Credit: <a href="https://arxiv.org/abs/2010.11929">Dosovitskiy et al., 2021</a></figcaption></figure><p>Traditionally with Vision Transformers we use a sequence of image “patches” to train a transformer model for image recognition, in place of the words used by language transformers. The patches allow us to move away from convolutional neural networks for image processing.</p><figure><figcaption>How frames/images are “patch-ified”. Credit: <a href="https://arxiv.org/abs/2307.06304">Dehghani et al., 2023</a></figcaption></figure><p>However, vision transformers were constrained to image training data that was fixed in size and aspect ratio, which limited quality and required vast amounts of image preprocessing.</p><figure><figcaption>Visualization of Slicing Video Temporal Data. Source: <a href="https://twitter.com/kitasenjudesign/status/1489260985135157258">kitasenjudesign</a></figcaption></figure><p>By treating videos as sequences of patches, Sora maintains the original aspect ratios and resolutions, similar to NaViT’s handling of images. 
<strong>This preservation is crucial for capturing the true essence of the visual data, enabling the model to learn from a more accurate representation of the world and thus giving Sora its near magical accuracy.</strong></p><figure><figcaption>Visualization of Spacetime Patching (Processing). Credit: OpenAI (Sora)</figcaption></figure><p>The method allows Sora to efficiently process a diverse array of visual data without the need for pre-processing steps like resizing or padding. This flexibility ensures that every piece of data contributes to the model’s understanding, much like how a chef uses a variety of ingredients to enhance a dish’s flavor profile.</p><p>The detailed and flexible handling of video data through spacetime patches lays the groundwork for sophisticated features such as accurate physics simulation and 3D consistency. These capabilities are essential for creating videos that not only look realistic but also adhere to the physical rules of the world, offering a glimpse into the potential for AI to create complex, dynamic visual content.</p><h3>Feeding Sora: The Role of Diverse Data in Training</h3><p>The quality and diversity of training data are crucial for the performance of generative models. Existing video models were traditionally trained on more restrictive datasets, with shorter clips and narrower targets.</p><p>Sora leverages a vast and varied dataset, including videos and images of different durations, resolutions, and aspect ratios. Given <a href="https://techcrunch.com/2024/02/15/openais-sora-video-generating-model-can-render-video-games-too/">its ability to re-create digital worlds like Minecraft</a>, its training set likely also included gameplay and simulated-world footage from systems such as Unreal or Unity, in order to capture all the angles and various styles of video content. 
This makes Sora a “generalist” model, just like GPT-4 for text.</p><p>This extensive training enables Sora to understand complex dynamics and generate content that is both diverse and high in quality. The approach mimics the way large language models are trained on diverse text data, applying a similar philosophy to visual content to achieve generalist capabilities.</p><figure><figcaption>Variable “Patches” NaViT vs. Traditional Vision Transformers. Credit: <a href="https://arxiv.org/abs/2307.06304">Dehghani et al., 2023</a></figcaption></figure><p>Just as the NaViT model demonstrates significant training efficiency and performance gains by packing multiple patches from different images into single sequences, Sora leverages spacetime patches to achieve similar efficiencies in video generation. This approach allows for more effective learning from a vast dataset, improving the model’s ability to generate high-fidelity videos while lowering the compute required versus existing modeling architectures.</p><h3>Bringing the Physical World to Life: Sora’s Mastery over 3D and Continuity</h3><p>3D space and object permanence are among the key standouts in Sora’s demos. Through its training on a wide range of video data without adapting or preprocessing the videos, Sora learns to model the physical world with impressive accuracy, as it’s able to consume the training data in its original form.</p><p>It can generate digital worlds and videos where objects and characters move and interact in three-dimensional space convincingly, maintaining coherence even when they are occluded or leave the frame.</p><h3><strong>Looking Ahead: The Future Implications of Sora</strong></h3><p>Sora sets a new standard for what’s possible in generative models. 
This approach is likely to inspire the open-source community to experiment with and advance the capabilities in visual modalities, fueling a new generation of generative models that push the boundaries of creativity and realism.</p><blockquote>The journey of Sora is just beginning, and as OpenAI puts it, “scaling video generation models is a promising path towards building general purpose simulators of the physical world”</blockquote><p>Sora’s approach, blending the latest in AI research with practical applications, signals a bright future for generative models. As these technologies continue to evolve, they promise to redefine our interactions with digital content, making the creation of high-fidelity, dynamic videos more accessible and versatile.</p><p><em>Unless otherwise noted, all images are by the author</em></p><hr><p><a href="https://medium.com/data-science/explaining-openai-soras-spacetime-patches-the-key-ingredient-e14e0703ec5b">Explaining OpenAI Sora’s Spacetime Patches: The Key Ingredient</a> was originally published in <a href="https://medium.com/data-science">TDS Archive</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>
</article>
<article>
<h1>Generative AI Design Patterns: A Comprehensive Guide</h1>
<p>Vincent Koc — Tue, 13 Feb 2024 23:19:59 GMT</p>
<h4>Reference architecture patterns and mental models for working with Large Language Models (LLMs)</h4><figure></figure><p><em>Note: When I initially published this article back in February it was an early thought experiment. Since then I have started working on a book for “Generative AI Design Patterns” with a major publisher. Please follow me to stay up to date on my patterns and ideas in this space.</em></p><h4>The Need For AI Patterns</h4><p>We all anchor to some tried and tested methods, approaches and patterns when building something new. This holds true in software engineering; however, for generative AI and artificial intelligence itself this may not be the case. With emerging technologies such as generative AI we lack well-documented patterns to ground our solutions.</p><p>Here I share a handful of approaches and patterns for generative AI, based on my evaluation of countless production implementations of LLMs. The goal of these patterns is to help mitigate and overcome some of the challenges with generative AI implementations such as cost, latency and hallucinations.</p><h4>List of Patterns</h4><ol><li><a href="#82c6">Layered Caching Strategy Leading To Fine-Tuning</a></li><li><a href="#fc13">Multiplexing AI Agents For A Panel Of Experts</a></li><li><a href="#a0b9">Fine-Tuning LLMs For Multiple Tasks</a></li><li><a href="#19ee">Blending Rules Based & Generative</a></li><li><a href="#1b7b">Utilizing Knowledge Graphs with LLMs</a></li><li><a href="#1575">Swarm Of Generative AI Agents</a></li><li><a href="#682f">Modular Monolith LLM Approach With Composability</a></li><li><a href="#07cb">Approach To Memory Cognition For LLMs</a></li><li><a href="#ee0a">Red & Blue Team Dual-Model Evaluation</a></li></ol><h4>1) Layered Caching Strategy Leading To Fine-Tuning</h4><figure></figure><p>Here we are solving for a combination of factors from cost, redundancy and training data when introducing a caching strategy and 
service to our large language models.</p><p>By caching these initial results, the system can serve up answers more rapidly on subsequent queries, enhancing efficiency. The twist comes with the fine-tuning layer: once we have sufficient data, feedback from these early interactions is used to refine a more specialized model.</p><p>The specialized model not only streamlines the process but also tailors the AI’s expertise to specific tasks, making it highly effective in environments where precision and adaptability are paramount, like customer service or personalized content creation.</p><p>To get started, there are pre-built services such as <a href="https://github.com/zilliztech/GPTCache">GPTCache</a>, or you can roll your own with common caching databases such as <a href="https://redis.io/">Redis</a>, <a href="https://cassandra.apache.org/_/index.html">Apache Cassandra</a> or <a href="https://memcached.org/">Memcached</a>. Be sure to monitor and measure your latency as you add additional services to the mix.</p><h4>2) Multiplexing AI Agents For A Panel Of Experts</h4><figure></figure><p>Imagine an ecosystem where multiple generative AI models, each orientated to a specific task (“agents”) and each a specialist within its domain, work in parallel to address a query. This <em>multiplexing</em> strategy enables a diverse set of responses, which are then integrated to provide a comprehensive answer.</p><p>This setup is ideal for complex problem-solving scenarios where different aspects of a problem require different expertise, much like a team of experts each tackling a facet of a larger issue.</p><p>A larger model such as GPT-4 is used to understand context and break the query down into specific tasks or information requests, which are passed to smaller agents. 
Agents could be smaller language models such as <a href="https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/">Phi-2</a> or <a href="https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v0.1">TinyLlama</a> that have been trained on specific tasks or given access to specific tools, or generalized models such as GPT or Llama with specific personality, context prompts and function calls.</p><h4>3) Fine-Tuning LLMs For Multiple Tasks</h4><figure></figure><p>Here we fine-tune a large language model on multiple tasks simultaneously instead of a single task. It’s an approach that promotes a robust transfer of knowledge and skills across different domains, enhancing the model’s versatility.</p><p>This multi-task learning is especially useful for platforms that need to handle a variety of tasks with a high degree of competence, such as virtual assistants or AI-powered research tools. This could potentially simplify training and testing workflows for a complex domain.</p><p>Some resources and packages for training LLMs include <a href="https://github.com/microsoft/DeepSpeed">DeepSpeed</a> and the training functions in <a href="https://huggingface.co/docs/transformers/training">Hugging Face’s Transformers library</a>.</p><h4>4) Blending Rules Based & Generative</h4><figure></figure><p>A number of existing business systems and organizational applications are still somewhat rules based. By fusing generative AI with the structured precision of rule-based logic, this pattern aims to produce solutions that are both creative and compliant.</p><p>It’s a powerful strategy for industries where outputs must adhere to stringent standards or regulations, ensuring the AI remains within the bounds of desired parameters while still being able to innovate and engage. 
A good example of this is generating intents and message flows for a phone call IVR system or traditional (<em>non-LLM based</em>) chat bots, which are rules based.</p><h4>5) Utilizing Knowledge Graphs with LLMs</h4><figure></figure><p>Integrating knowledge graphs with generative AI models gives them a fact-orientated superpower, allowing for outputs that are not only contextually aware but also more factually correct.</p><p>This approach is crucial for applications where truth and accuracy are non-negotiable, such as in educational content creation, medical advice, or any field where misinformation could have serious consequences.</p><p>Knowledge graphs and graph ontologies (<em>the set of concepts for a graph</em>) allow complex topics or organizational problems to be broken into a structured format that helps ground a large language model with deep context. You can also use a language model to generate the ontologies in a format such as JSON or RDF; here is an <a href="https://gist.github.com/koconder/c37806ecc2e0a6d1ed3cdfbe4951b199">example prompt I created that you can use</a>.</p><p>Services you can use for knowledge graphs include graph database services such as <a href="https://arangodb.com/">ArangoDB</a>, <a href="https://aws.amazon.com/neptune/">Amazon Neptune</a>, <a href="https://azure.microsoft.com/en-us/products/cosmos-db">Azure Cosmos DB</a> and <a href="https://neo4j.com/">Neo4j</a>. 
There are also wider datasets and services for accessing broader knowledge graphs, including the <a href="https://cloud.google.com/enterprise-knowledge-graph/docs/search-api">Google Enterprise Knowledge Graph API</a>, <a href="https://github.com/pykeen/pykeen?tab=readme-ov-file#datasets">PyKEEN Datasets</a>, and <a href="https://www.wikidata.org/">Wikidata</a>.</p><h4>6) Swarm Of AI Agents</h4><figure></figure><p>Drawing inspiration from natural swarms and herds, this model employs a multitude of AI agents that collectively tackle a problem, each contributing a unique perspective.</p><p>The resulting aggregated output reflects a form of collective intelligence, surpassing what any individual agent could achieve. This pattern is particularly advantageous in scenarios that require a breadth of creative solutions or when navigating complex datasets.</p><p>An example of this could be <a href="https://www.fieldstudy.ai/">reviewing a research paper from multiple “experts’” points of view</a>, or assessing customer interactions for many use cases at once, from fraud to offers. We take these collective “agents” and combine all their inputs together. For high-volume swarms you can look at deploying messaging services such as <a href="https://kafka.apache.org/">Apache Kafka</a> to handle the messages between the agents and services.</p><h4>7) Modular Monolith LLM Approach With Composability</h4><figure></figure><p>This design champions adaptability, featuring a modular AI system that can dynamically reconfigure itself for optimal task performance. It’s akin to having a Swiss Army knife, where each module can be selected and activated as needed, making it highly effective for businesses that require tailor-made solutions for varying customer interactions or product needs.</p><p>You can use various autonomous agent frameworks and architectures to develop each of your agents and their tools. 
Example frameworks include <a href="https://github.com/joaomdmoura/crewAI">CrewAI</a>, <a href="https://www.langchain.com/">LangChain</a>, <a href="https://www.microsoft.com/en-us/research/project/autogen/">Microsoft AutoGen</a> and <a href="https://superagi.com/">SuperAGI</a>.</p><p>For a sales modular monolith this could be one agent focused on prospecting, one handling bookings, one focused on generating messaging, and another updating databases. In future, as specific services become available from specialized AI companies, you can swap out a module for an external or third-party service for a given set of tasks or domain-specific problems.</p><h4>8) Approach To Memory Cognition For LLMs</h4><figure></figure><p>This approach introduces an element of human-like memory to AI, allowing models to recall and build upon previous interactions for more nuanced responses.</p><p>It’s particularly useful for ongoing conversations or learning scenarios, as the AI develops a more profound understanding over time, much like a dedicated personal assistant or an adaptive learning platform. Memory cognition approaches can be developed by summarizing key events and discussions and storing them in a vector database over time.</p><p>To keep the compute cost of summaries low, you can perform summarization with smaller NLP libraries such as <a href="https://spacy.io/">spaCy</a>, or <a href="https://huggingface.co/docs/transformers/model_doc/bart">BART language models</a> if dealing with considerable volumes. The databases used are vector based, and retrieval at the prompt stage to check the short-term memory uses a similarity search to locate key “facts”. 
For those interested in a working solution, there is an open-source project following a similar pattern called <a href="https://memgpt.readme.io/docs/index">MemGPT</a>.</p><h4>9) Red & Blue Team Dual-Model Evaluation</h4><figure></figure><p>In the Red and Blue team evaluation model, one AI generates content while another critically evaluates it, akin to a rigorous peer-review process. This dual-model setup is excellent for quality control, making it highly applicable in content generation platforms where credibility and accuracy are vital, such as news aggregation or educational material production.</p><p>This approach can be used to replace parts of human feedback for complex tasks with a fine-tuned model that mimics the human review process and refines the results when evaluating complex language scenarios and outputs.</p><h4><strong>Takeaways</strong></h4><p>These design patterns for generative AI are more than mere templates; they are the frameworks upon which the intelligent systems of tomorrow will grow. As we continue to explore and innovate, it’s clear that the architecture we choose will define not just the capabilities but the very identity of the AI we create.</p><p>This list is by no means final; we will see this space develop as the patterns and use cases for generative AI expand. <em>This write-up was inspired by the </em><a href="https://tomtunguz.com/ai-design-patterns/"><em>AI design patterns</em></a><em> published by Tomasz Tunguz.</em></p><h4>Enjoyed This Story?</h4><p>Vincent Koc is a highly accomplished, commercially-focused technologist and futurist with a wealth of experience focused in data-driven and digital disciplines.</p><p><a href="https://medium.com/subscribe/@vkoc">Subscribe for free</a> to get notified when Vincent publishes a new story. 
Or follow him on <a href="https://www.linkedin.com/in/koconder/">LinkedIn</a> and <a href="https://twitter.com/koconder">X</a>.</p><p><a href="https://medium.com/subscribe/@vkoc">Get an email whenever Vincent Koc publishes.</a></p><p><em>Unless otherwise noted, all images are by the author</em></p><hr><p><a href="https://medium.com/data-science/generative-ai-design-patterns-a-comprehensive-guide-41425a40d7d0">Generative AI Design Patterns: A Comprehensive Guide</a> was originally published in <a href="https://medium.com/data-science">TDS Archive</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>
</article>
<article>
<h1>Navigating the AI Landscape of 2024: Trends, Predictions, and Possibilities</h1>
<p>Vincent Koc — Tue, 02 Jan 2024 21:02:08 GMT</p>
<h4>2024 beckons with a promise of innovation — a year where AI and technology converge to redraw the maps of possibility.</h4><figure></figure><h4>Introduction</h4><p>Welcome to a journey through the possibilities that 2024 holds for AI and technology. Here, each prediction is a potential window into a future filled with innovation, change and, more importantly, opportunity, similar to the industrial revolution of the 1950s. The 1950s witnessed the rise of digital computing, reshaping industries and societal norms. Today, artificial intelligence plays a similar role, forging the next industrial revolution.</p><p>Much like the post-war technological boom, we are navigating 2024 with industries transforming, new skills in demand, and significant ethical considerations being raised. The views expressed here are all my own, and based on my personal, commercial, and academic experiences combined.</p><h4>Table Of Contents — All Predictions & Trends:</h4><ol><li><a href="#eb51">Generative AI Leaps From Hype To Center Stage</a></li><li><a href="#1b8b">Expect To See The Next Generation Of Neural Networks</a></li><li><a href="#ddd4">Every Data Platform Launches Their Vector Data Feature</a></li><li><a href="#76f7">Rush To Control The Hardware And Platform Supply Chain</a></li><li><a href="#3528">2024 Is The Year Of The AI Wearables & XR</a></li><li><a href="#4bee">AI Agents Will Start Communicating With Other AI Agents</a></li><li><a href="#e8f1">Generative AI Modalities Will Expand</a></li><li><a href="#cdc7">Consumers & Regulators Push For More Democratized AI</a></li><li><a href="#1611">The New Era of AI-Infused Marketing Strategies</a></li><li><a href="#a195">The “Garbage In, Garbage Out” Dilemma Intensifies</a></li><li><a href="#b80d">Purpose Built Smaller Foundational Models Commonplace</a></li><li><a href="#be8a">The Dawn Of AI Marketplaces For Agents</a></li><li><a href="#78a7">AI Products Will Go Beyond SaaS Model</a></li><li><a href="#80d9">BYO AI Movement Pushes 
Need For Secure Digital Identities</a></li><li><a href="#dab5">Conclusion & Key Takeaways</a></li></ol><figure></figure><h4>1) Generative AI Leaps From Hype To Center Stage</h4><p>Poised to redefine tech’s ecosystem, generative AI shifts from a hype-cycle balloon that’s “going to burst” to a core strategy and more of a sure thing, just as <a href="https://edition.cnn.com/travel/article/flying-whales-airships-hnk-spc-intl/index.html">airships</a> are making a comeback.</p><p>For large enterprises it’s a paradigm shift in how they approach problem-solving and innovation, as they move from experimenting with generative AI to adopting it. This technological course correction is akin to the transformative wave brought about by the early adoption of cloud technologies, suggesting a similar, if not greater, impact on the tech ecosystem.</p><blockquote><em>97% of business owners already believe that generative AI tools such as ChatGPT will have a positive impact on their business (</em><a href="https://www.forbes.com/advisor/in/business/ai-statistics/"><em>Forbes</em></a><em>).</em></blockquote><p>The landscape of 2024, therefore, promises to be one where generative AI is not just a buzzword but a critical driver of technological advancement and business transformation. Where AI strategy is <em>strategy</em>.</p><figure></figure><h4>2) The Next Generation Of Neural Networks Starts Emerging</h4><p>With <a href="https://arxiv.org/abs/2311.00871">doubts forming</a> about the general cognitive abilities of the transformer architectures powering LLMs, the race for AGI has fuelled the research community’s search for the next best thing. We have seen some recent advancements as well as developments in existing technologies:</p><ul><li>State space models (SSMs) such as <a href="https://arxiv.org/abs/2312.00752">Mamba</a>, a model that excels in linear-time sequence modeling with <em>selective state spaces</em>. It is an alternative to the transformer architecture that runs large language models today. 
It represents a leap forward in how AI processes and understands sequences, a fundamental aspect of human cognition.</li><li><a href="https://www.turing.ac.uk/research/interest-groups/neuro-symbolic-ai">Neuro-symbolic AI</a>, blending the best of neural networks’ learning capabilities with the precision of symbolic AI. This hybrid approach promises a more nuanced and sophisticated understanding of complex problems, bridging the gap between human-like reasoning and machine efficiency.</li><li>And lastly, <a href="https://research.ibm.com/blog/what-is-alignment-ai">AI alignment</a> with self-correcting models. The aim is to create models that can adapt and correct themselves without constant human intervention, moving closer to a form of generative AI that operates independently and responsibly without large-scale hallucination risks.</li></ul><p>These new model architectures are painting a future where AI’s capabilities could rival human cognition and overcome the limitations of generative AI today. Research is early; however, expect to see someone take a leap of faith and break ground in this space.</p><figure></figure><h4>3) Every Data Platform Launches Their Vector Data Solution</h4><p>With the reliance on vector databases for generative AI, we will see all the key data platform players bring to market their own flavour of solution. Although vector databases are not a new concept, they are not traditionally part of the “modern data stack” and have in the past been used for search engines and other types of machine learning.</p><p>Vector databases are a key requirement for more complex use cases of generative AI such as conversational memory, searching your documents (RAG), and multi-modal solutions such as indexing images. 
It’s no surprise that in mid-2023 we saw an explosion in demand following multi-modal models becoming more accessible.</p><figure><figcaption>Demand for Vector Databases following OpenAI GPT-4 release in March 2023 — Google Trends (Worldwide)</figcaption></figure><p>With Databricks recently releasing <a href="https://www.databricks.com/blog/introducing-databricks-vector-search-public-preview">their vector data solution</a>, we can expect key modern data platform players such as Snowflake to follow this trend in 2024. Other adjacent technology providers may also jump on the trend by providing various features and services. Nearly all database technologies will start calling themselves “vector stores” in 2024.</p><p>Alongside this adoption shift, senior data and software engineers will likely be expected to get more comfortable with concepts such as vector indexes, semantic search solutions, embeddings and possibly even algorithms such as BM25.</p><p>As a stretch, we might see acquisitions of, or further funding rounds for, some of the initial modern vector search solutions adopted by the generative AI community such as <a href="https://www.trychroma.com/">Chroma</a>, <a href="https://weaviate.io/">Weaviate</a>, <a href="https://www.pinecone.io/">Pinecone</a>, and <a href="https://qdrant.tech/">Qdrant</a>.</p><figure><figcaption>GitHub “stars” for open source vector database repositories. 
Growth aligns with GPT-3.5 release — Source: Star-History & GitHub</figcaption></figure><p>In the open-source world of vector stores, Milvus is currently the crowd favourite and has established itself as the enterprise choice with its managed services, but recently Qdrant has made some exceptional headway with near-exponential growth in 2023.</p><p>If you have some spare time, read the four-part <a href="https://thedataquarry.com/posts/vector-db-1/">detailed vector database breakdown</a> by fellow AI engineer Prashanth Rao for a great in-depth understanding of the capabilities and various providers of vector database solutions.</p><figure></figure><h4>4) Rush To Control The Hardware And Platform Supply Chain</h4><p>With the dominance of artificial intelligence, it is increasingly important to control the supply chain end-to-end in order to hold the keys to innovation.</p><p>NVIDIA, for instance, is a leader in chip manufacturing and has been a clear winner in the AI race. The question on everyone’s mind is whether NVIDIA will expand its horizons into the cloud computing realm, leveraging its hardware expertise to offer integrated AI cloud services. NVIDIA already has a cloud streaming service called <a href="https://www.nvidia.com/geforce-now/">GeForce NOW</a> offering high-end graphics processing on demand. Such a move could redefine the competitive landscape, offering NVIDIA a more direct influence over AI’s developmental trajectory.</p><figure><figcaption>Nvidia Share Price, Last 5 Years (NASD:NVDA) — Source: Nvidia Investor Relations</figcaption></figure><p>On the other side of the spectrum are cloud AI providers like Amazon and emerging AI players like Anthropic and Mistral. 
Their current reliance on external hardware sources for AI operations raises a crucial question: Will they follow in the footsteps of OpenAI, <a href="https://www.wired.com/story/openai-buy-ai-chips-startup-sam-altman/">which has begun sourcing its own chips</a>, and Google with its <a href="https://cloud.google.com/tpu">TPUs</a> and <a href="https://coral.ai/">Coral AI</a>? This strategy could signify a shift towards self-reliance and customization in AI hardware, potentially leading to more tailored and efficient AI solutions for the top AI providers.</p><figure><figcaption>Google Coral — Local AI Development Boards and Hardware — Credit: Coral AI</figcaption></figure><p>Everyone in the underlying hardware layer of AI will be rushing to further control their supply chain. Will chip manufacturer NVIDIA move to cloud? Will cloud AI providers such as Amazon or the likes of Anthropic follow OpenAI in sourcing their own chips? Will mobile chip makers like Qualcomm be the winners as they power the new AI mobile devices and wearables?</p><p>The trend towards vertical integration of AI underscores a larger narrative: control over AI hardware is becoming synonymous with control over the future of tech. We expect to see new players muscle into purpose-built AI chips to serve cloud players, countries rushing to fuel microprocessor development and, finally, OpenAI getting into the chip game in early 2024.</p><figure></figure><h4>5) 2024 Is The Year Of The AI Wearables & XR</h4><p>This year, we’re witnessing an intimate revolution in how we interact with technology, with the world of AI and wearables fusing alongside extended reality (XR) devices. 
These devices are not just gadgets; they’re extensions of our digital selves, blending seamlessly into our daily lives.</p><figure><figcaption>Humane AI Pin — Source: Humane</figcaption></figure><p>Humane’s <a href="https://hu.ma.ne/aipin">AI Pin</a> (powered by Qualcomm Snapdragon processors) and <a href="https://mytab.ai/">Tab</a> are redefining the wearable landscape. These devices offer a glimpse into a future where wearables are no longer just about tracking health metrics or receiving notifications. They are about enhancing human interactions, offering real-time AI assistance, and providing an augmented experience of the world around us.</p><figure><figcaption>Rewind AI Pendant — Source: Rewind</figcaption></figure><p>Rewind, another breakthrough product, is revolutionizing how we capture and relive our memories. Imagine a device that not only records moments as we experience them but also allows us to revisit, understand and reflect on our memories.</p><p>But it’s not just the new players making waves. Tech giants like Apple and <a href="https://www.businessinsider.com/were-getting-closer-to-openais-first-ai-device-2023-12">OpenAI are stepping into the wearable arena</a>, promising to bring their colossal innovation capabilities. Apple, with its track record in creating trendsetting devices, is highly likely to introduce wearables that integrate seamlessly with its ecosystem within or alongside its XR <a href="https://www.apple.com/apple-vision-pro/">Apple Vision Pro</a>, offering unparalleled user experiences.</p><p>OpenAI, on the other hand, could leverage its expertise in AI to introduce wearables that bring advanced AI functionalities right to our fingertips (<em>or wrists</em>). The potential for AI wearables that interact with generative AI models like ChatGPT is particularly exciting, offering a blend of convenience and intelligence unlike anything we’ve seen before.</p><p>Meta is another key player to watch. 
Their foray into VR wearables has already shown us the potential for immersive digital experiences. As they continue to innovate in this space, we can expect wearables that not only augment reality but create entirely new realms for us to explore and interact with.</p><p>In 2024, AI wearables are set to move beyond mere technology. They are becoming personal companions, digital assistants, and gateways to new realities. They represent a shift towards a more intimate and interactive relationship with technology, one where our digital and physical worlds intertwine seamlessly. This year, we’re not just wearing technology; we’re living it.</p><figure></figure><h4>6) AI Agents Will Start Communicating With Other AI Agents</h4><p>Envision an ecosystem where specialized agents, each with its unique expertise and knowledge base, interact and collaborate. With the development of AI “agents” we will continue to see growth in this space heading into 2024.</p><figure><figcaption>AI Agents Market Map Dec 23 — Credit: Olivia Moore</figcaption></figure><p>We will see agents go beyond supporting individual needs, like writing my email, solving a customer support issue or ordering my groceries, to an ecosystem where agents start to interact with other agents. This paradigm shift is closely tied to the concept of data products, where enterprises will have the opportunity to monetize their agents in the same way they did with their models and datasets. 
This will foster a new ecosystem of interconnected, intelligent agents.</p><a href="https://medium.com/media/3c77ebe5860fbbfaea4b3ffd7f752a2c/href">https://medium.com/media/3c77ebe5860fbbfaea4b3ffd7f752a2c/href</a><p>We are seeing makers of robots and humanoids, such as <a href="https://bostondynamics.com/">Boston Dynamics</a> and <a href="https://www.tesla.com/AI">Tesla</a>, having to solve this problem, as various robots need to co-exist and communicate locally to decide how to carry out a task.</p><p>Companies with strongholds of data within given verticals like Bloomberg (<em>finance</em>) and LexisNexis (<em>law</em>) are poised to be potential frontrunners in this domain. Bloomberg, with its stronghold in finance data, could introduce sophisticated finance agents and has already <a href="https://www.bloomberg.com/company/press/bloomberggpt-50-billion-parameter-llm-tuned-finance/">started on its own LLMs</a>, while LexisNexis could leverage its vast legal information repository to develop legal agents. 
These agents, powered by their respective deep moats of data, would not only serve their direct users but also act as invaluable resources for other enterprises and systems to power a new digital workforce.</p><p>Expect to start seeing new agent solutions that go beyond the digital workforce into agent orchestration, management and monitoring. Players in the robotic process automation space such as <a href="https://www.uipath.com/">UiPath</a>, as well as humanoid advancements for manufacturing and plant work, will start to play into this space with their existing experience of automation and robotics systems at scale.</p><figure></figure><h4>7) Generative AI Modalities Will Expand</h4><p>Going beyond text, code, image, video and audio, new and more immersive modalities and senses such as 3D, genomics, smell and taste will start to come to market in early forms.</p><figure><figcaption>Generative spatial AI to generate new town layouts — <a href="https://www.generativespatialai.com/">https://www.generativespatialai.com/</a></figcaption></figure><p>Generative AI is set to grow out of its current boundaries of text, code, image, video, and audio. We will embrace more immersive modalities that help science with protein structures and materials or engage additional senses like 3D, smell, and taste. 
These novel modalities are expected to emerge in their early forms, signaling the next wave of Generative AI use cases.</p><a href="https://medium.com/media/f32d3fb78dfbce27b157affe8aa5cb8b/href">https://medium.com/media/f32d3fb78dfbce27b157affe8aa5cb8b/href</a><p>The rise of autonomous AI agents and multi-modal models, coupled with advancements in wearables and extended reality (XR), is paving the way for more immersive and interactive experiences for consumers.</p><blockquote>Imagine your dreams turned into a VR world, every game having a unique world tailored to your perspective?</blockquote><p>With <a href="https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/">recent 3D modelling technologies</a> (<em>Gaussian splatting</em>), where videos can be converted into 3D virtual realities, we will see this grow to new heights with generative technologies.</p><figure><figcaption>A-Lab Berkeley, Robot Testing New Materials — Credit: Marilyn Sargent/Berkeley Lab</figcaption></figure><p>The biggest impact will come from material sciences and genomics. The <a href="https://deepmind.google/discover/blog/millions-of-new-materials-discovered-with-deep-learning/">GNoME model</a> developed by Google DeepMind has already been used to make breakthroughs in material sciences, discovering new crystal structures that could drive everything from better batteries to more efficient computers.</p><p>Scientific fields such as these are where the most profound research advancements are likely to occur.</p><figure></figure><h4>8) Consumers & Regulators Push For More Democratized <strong>AI</strong></h4><p>There will be a continued push towards greater accessibility and inclusivity with AI, but challenges remain due to the complexities and costs of developing foundational AI models. 
This dichotomy sets the stage for increasing public demands for transparency and ethical oversight in AI.</p><figure><figcaption>EU AI Act — Proposed Levels of Risk — Source: EY</figcaption></figure><p>Concerns over privacy and the societal impact of AI are driving consumers and regulatory bodies, especially in regions like the EU where GDPR was the catalyst for modern data privacy laws, to advocate for more stringent governance of AI. This year, we expect to see strides in establishing frameworks for auditing AI models, standardizing accuracy, and introducing “report cards” for AI systems, but there is still a long way to go.</p><figure><figcaption>Do Foundation Model Providers Comply with the Draft EU AI Act? — Source: Stanford CRFM</figcaption></figure><p>The focus on risk management in AI will become more pronounced. Enterprises will navigate a landscape where AI is not only a tool for innovation but also under close regulatory scrutiny. Unified frameworks and standards will emerge, guiding businesses in responsible AI adoption and ensuring that AI’s integration into mainstream society is safe and aligned with public welfare.</p><figure></figure><h4>9) The New Era of AI-Infused Marketing Strategies</h4><p>The marketing domain, traditionally commanding a lion’s share of enterprise budgets, is now navigating through a transformative landscape. The catalyst? The rise of chat-based tools like ChatGPT. These innovations are potentially leading to a noticeable decline in traditional search volume, fundamentally altering how consumers engage with information.</p><blockquote><em>“A report from BofA argues that Google's search market share is down ever so slightly and attributes that to ChatGPT. (I question that.) I think a more interesting metric to look at would be search frequency; are people conducting as many searches on Google as they used to?” (Greg Sterling 🇺🇦 on Twitter)</em></blockquote><p>In this evolving scenario, marketers find themselves at a crossroads. The ability to influence or monitor brand mentions in these AI-driven dialogues is still in its nascent stages. Consequently, there’s a growing trend towards <a href="https://funnel.io/blog/2024-marketing-predictions">adapting marketing strategies for a generative AI world</a>. This adaptation involves a strategic reliance on traditional media in the short term, leveraging its reach and impact to build and sustain brand presence.</p><p>Simultaneously, we are witnessing a significant shift in the technological landscape. The move from browser-based tools to on-device applications is gaining momentum. Leading this charge are innovations like Microsoft Copilot, Google Bard on devices such as Android, and the anticipated launch of Apple’s own large language model (LLM) sometime in 2024. This transition indicates a paradigm shift from web-centric interactions to a more integrated, device-based AI experience.</p><figure><figcaption>New Microsoft Surface X expected late 2024 — Source: Microsoft</figcaption></figure><p>This shift extends beyond mere convenience; it represents a fundamental change in user interaction paradigms. As AI becomes more seamlessly integrated into devices, the distinction between online and offline interactions becomes increasingly blurred. Users are likely to interact with AI in more personal, context-aware environments, leading to a more organic and engaging user experience. 
For tech giants like Google, Microsoft, and Apple, already entrenched in the marketing services world, this represents an opportunity to redefine their offerings.</p><figure><figcaption>ChatGPT not knowing who I am — Source: Vincent Koc and OpenAI ChatGPT</figcaption></figure><p>We can anticipate the emergence of new “answer analytics” platforms and operating models in marketing to support <em>answer engine optimisation</em>. These tools will likely focus on understanding and leveraging the nuances of AI-driven interactions, and could also draw on the training data to understand how a given brand or product might be portrayed in the results.</p><p>Digital marketeers will start to think more deeply about how they are represented in these training datasets, just as they once did with search engine indexes.</p><figure><figcaption>Screenshot of Top Domains Crawled by Common Crawl, the dataset used to train most Large Language Models — <a href="https://commoncrawl.github.io/cc-crawl-statistics/plots/domains.html">https://commoncrawl.github.io/cc-crawl-statistics/plots/domains.html</a></figcaption></figure><p>Moreover, the potential launch of ad-sponsored results or media measurement tools by platforms like OpenAI could introduce a new dimension in digital advertising. This development would not only offer new avenues for brand promotion but also challenge existing digital marketing strategies, prompting a reevaluation of metrics and ROI assessment methodologies.</p><p>As LLMs migrate into devices, moving away from traditional web interfaces, the marketing landscape is poised for significant changes. Marketers must adapt to these shifts, leveraging both traditional media and emerging AI technologies, to effectively engage with their audiences in this new digital era.
This dual approach, combining the impact of traditional media with the precision of AI-driven analytics, could very well be the key to success in the rapidly evolving marketing landscape of 2024.</p><figure></figure><h4>10) <strong>The “Garbage In, Garbage Out” Dilemma Intensifies</strong></h4><p>As organizations increasingly pivot towards leveraging generative AI models and developing their own fine-tuned solutions, the spotlight falls sharply on the quality of input data. The classic saying in data management circles, “garbage in, garbage out”, has bubbled up again now that data quality is back on the table.</p><figure><figcaption>Trend for: Data Quality Topic — Source: Google Trends Worldwide</figcaption></figure><p>Organisations and leaders confront the harsh reality that high-quality, accurately labeled data is the cornerstone of effective AI deployment. The issue goes beyond the obvious availability of data; it’s about its relevance, accuracy, and the context it provides. Issues of bias and misguided training data can spell disaster for the output of a model.</p><figure><figcaption>A range of biases has been found in the training data used by Large Language Models. — Source: <a href="https://arxiv.org/abs/2309.00770">Bias and Fairness in Large Language Models: A Survey — arXiv</a></figcaption></figure><p>However, the challenges don’t stop there. Existing data and AI pipeline technologies, which once seemed adequate, are now being pushed to their limits. They are often found wanting in the face of the nuanced demands of advanced AI models.
This gap between capability and requirement necessitates an evolution in data processing tools and methodologies.</p><blockquote>The <a href="https://www.precisely.com/resource-center/analystreports/lebow-report-2023">2023 Data Integrity Trends and Insights Report</a> uncovered poor data quality as a pervasive theme for organizations across the industry, with 70% of those with low levels of trust in their data pointing to data quality as the biggest challenge to making confident decisions.</blockquote><p>Furthermore, the pursuit of AI excellence comes with its own set of logistical hurdles. The intensive compute power required for these sophisticated AI models translates to a substantial demand for GPUs.</p><figure><figcaption>Meme depicting the “GPU shortage”</figcaption></figure><p>But this isn’t just about having the financial muscle to invest in hardware; it’s about the availability of these resources in the market. As more players enter the AI arena, the scramble for GPUs intensifies, leading to potential bottlenecks in AI development and deployment, which makes it harder for organisations to adapt to the new normal in AI.</p><p>As 2024 unfolds, we witness a renewed focus on data quality and infrastructure enhancement, shaping the trajectory of AI development.</p><figure></figure><h4><strong>11) Purpose-Built Smaller Foundational Models Become Commonplace</strong></h4><p>2024 might very well be the year of the small foundational model. These specialized, purpose-built AI models are set to take center stage, outshining their generalized counterparts in efficiency and precision.</p><p>Organisations now have a number of options: use pre-trained generalised large language models such as OpenAI GPT, Google Bard, or Anthropic Claude [<em>RL model in chart below</em>], or venture into the world of building their own.</p><figure><figcaption>LLM development stages, pioneered by the InstructGPT paper, leading to ChatGPT.
This figure is adapted from Chip Huyen’s post “<a href="https://huyenchip.com/2023/05/02/rlhf.html">RLHF: Reinforcement Learning from Human Feedback</a>” — Source: <a href="https://argilla.io/blog/argilla-for-llms/">Argilla</a></figcaption></figure><p>Building your own comes with two options: fine-tuning, which takes a base (foundational) language model and teaches it new things (<em>as you would imagine fine-tuning a car to go faster</em>), or going deep and creating your own foundational (base) model altogether.</p><figure><figcaption>OpenAI GPT Custom Model for Enterprise — Source: OpenAI</figcaption></figure><p>Generalised models such as the GPT models, with 100 billion parameters (<em>or “switches” in its model programming</em>), <a href="https://www.cnbc.com/2023/03/13/chatgpt-and-generative-ai-are-booming-but-at-a-very-expensive-price.html">would cost roughly upwards of $4 million</a> to rebuild. OpenAI has also recently started offering a service for enterprises to “build your own GPT”, with pricing starting from $2–3 million.</p><p>Organizations that have either developed their own foundational models or fine-tuned existing ones to their specific use cases are poised for success. This approach aligns with the 80:20 rule, where the <a href="https://www.fastcompany.com/90938411/smaller-language-models-generative-ai-chatbots">focus is on smaller, tailored models that cater to specific needs</a> rather than attempting to appeal to the masses with generalized solutions.</p><p>The true value for organizations lies in the ability to develop these purpose-built models for discrete tasks. Not only do these models offer higher accuracy and relevance, but they also present new monetization opportunities.
In a world increasingly driven by specialized needs, these models become invaluable assets, offering solutions that are both effective and economically viable.</p><p>Moreover, the trend is shifting away from relying solely on large, general-purpose models, as they are not quite perfect for every need. Many organizations have built solutions on top of these broad models, acting as “thin wrappers” that offer limited scope for customization and scalability. While these solutions may have been a stepping stone, they are unlikely to provide the long-term value that developing proprietary models can offer.</p><figure><figcaption>Generalised Model Availability and Quality Prediction — Source: <a href="https://www.sequoiacap.com/article/generative-ai-a-creative-new-world/">Sequoia Capital</a></figcaption></figure><p>The winners in this evolving landscape will be those who invest in developing their own models, whether generalised or small foundational models, to plug gaps in the generalised space. This strategy not only increases accuracy and effectiveness but also reduces cost overheads. Smaller models are not only cheaper to run but also quicker to adapt and easier to manage.</p><blockquote>This leads to the concept of “<strong>Total Cost of Modelling</strong>,” where the benefits of developing and maintaining these models outweigh the initial investment, offering a more sustainable and cost-effective approach in the long run.</blockquote><p>As we look towards 2024, it’s clear that the ability to create and leverage small foundational models will be a key differentiator in the competitive AI market.
This shift marks a significant move towards more personalized, efficient, and economically sound AI solutions.</p><figure></figure><h4>12) The Dawn Of AI Marketplaces For Agents</h4><p>AI marketplaces are emerging as adaptive and responsive platforms, reshaping the way we think about technology transactions and interactions.</p><figure><figcaption>OpenAI GPT Store, Launching in 2024 — Source: OpenAI</figcaption></figure><p>Leading the charge, OpenAI is poised to unveil its much-anticipated “GPT marketplace” in early 2024, setting a new benchmark in the AI arena. This move is expected to open the floodgates, with other tech giants like Meta quickly following suit. We could witness an array of players, from established giants like Amazon, Apple and even ByteDance to emerging startups, diving into this space.</p><p>But the ripple effect of this revolution extends beyond traditional tech entities. With the growth of autonomous agents and the surge in wearable technology, we will see another Apple iPhone and App Store moment. Developers will see this as the next gold rush of opportunity.</p><p>In this transformative phase, the AI marketplace phenomenon is expanding its reach from B2B to B2C sectors. We’re likely to see a diverse range of players trying their hand at this, each bringing unique value propositions to the table.
From consumer-focused AI applications to enterprise-level solutions, the spectrum of offerings in these marketplaces will cater to a wide array of needs and aspirations.</p><figure></figure><h4>13) AI Products Will Go Beyond the SaaS Model</h4><p>With the proliferation of AI marketplaces and tools, traditional pricing strategies are being re-evaluated, making way for innovative approaches that cater to the unique nature of AI services.</p><figure><figcaption>Usage-Based Companies — Source: <a href="https://openviewpartners.com/usage-based-pricing/">OpenView Partners</a></figcaption></figure><p>We are likely to witness a significant shift from the conventional app store pricing model to more dynamic, consumption-based billing systems. These models, reminiscent of utility billing like phone lines, are poised to become more prevalent, aligning with the concept of renting AI agents. In this setup, customers would pay based on the extent and nature of their AI usage, offering a flexible and potentially more equitable pricing structure.</p><p>But the evolution doesn’t stop there. The increasing adoption of AI marketplaces and tools is also paving the way for varied pricing strategies and novel business models. This change is driven by the need to accommodate a wide range of AI applications and services, each with its own value proposition and usage patterns. This could see trials of either the revenue-share (App Store) or the royalties-on-usage (Spotify) model for developers and their AI services or agents on marketplaces.</p><p>Another emerging model could be performance-based pricing, where charges are aligned with the outcomes or results delivered by the AI tool.
Such a model would be particularly appealing in sectors where AI’s impact can be quantitatively measured, such as marketing analytics, financial forecasting, or even creative industries.</p><figure><figcaption>Bundled vs Unbundled Strategies in Pricing — Source: <a href="https://notes.mtb.xyz/p/bundling-unbundling-and-timing">Matt Brown</a></figcaption></figure><p>Furthermore, as AI continues to penetrate various sectors, cross-industry partnerships could give rise to bundled services. These bundles could combine AI tools with traditional software services, offering a comprehensive package that addresses a wider array of business needs.</p><p>The onset of these new pricing models and strategies reflects a marketplace that is rapidly adapting to the unique challenges and opportunities presented by AI. As businesses and consumers alike become more familiar with AI capabilities, the demand for flexible, transparent, and value-aligned pricing models will likely intensify.</p><figure></figure><h4>14) BYO AI Movement Pushes Need For Secure Digital Identities</h4><p>The spread of modern and generative AI tools will expand our digital footprints, necessitating secure, portable digital identities, where the challenge is to balance robust security with user accessibility. Users will expect a personalised experience where preferences, history and context will be key to using many AI services across the web.</p><p>Banks and e-government platforms are emerging as potential custodians of these single digital identities and personal preferences. This consolidation points towards a streamlined, more secure digital existence. But it’s not just about security; it’s about the seamless integration of our digital selves across various platforms.</p><p>“Bring Your Own AI” (BYO AI) ties directly into this. Imagine carrying your digital preferences, learning styles, and even shopping habits seamlessly from one digital interaction to another.
This portability isn’t just convenient; it’s transformative. It allows for a level of personalization and efficiency previously unattainable. Wearables are also becoming integral to managing our digital identities. By constantly learning from our interactions, they evolve into personal data hubs that not only understand our preferences but anticipate our needs.</p><p>The integration of AI into work environments means that our digital preferences could automatically adjust settings in office applications, communication tools, and even physical workspaces. Imagine entering a meeting room where the lighting, temperature, and even digital displays are automatically tailored to your preferences.</p><p>However, this level of personalization and data integration raises questions about privacy and data usage. As these digital identities become more intricate and intertwined with AI, the potential for advertisers and digital experience providers to leverage them for hyper-personalization is significant. This could lead to a new era of contextual advertising and consumer engagement, where promotions are not just targeted but deeply integrated into our digital personas.</p><p>This integration of identity with AI will redefine how we interact with technology, both in personal and professional spheres, leading to a more personalized, efficient, and connected existence. As we embrace this future, ethical considerations and privacy safeguards become more crucial than ever.</p><h4>Conclusion & Key Takeaways</h4><p>As we look towards 2024, the potential of AI and technology to reshape our world is undeniable.
Each of these predictions offers a glimpse into a future where innovation, responsibility, and inclusivity go hand in hand.</p><p>Key takeaways:</p><ol><li><strong>Generative AI Emerges as a Core Technology Strategy</strong>: Marking a shift from hype to mainstream adoption across various sectors.</li><li><strong>Advancements in Neural Networks Edge Closer to AGI</strong>: New architectures like Mamba and neuro-symbolic AI enhance cognitive capabilities significantly.</li><li><strong>AI Wearables and Extended Reality (XR) Gain Prominence</strong>: Offering augmented human experiences and interactions in daily life.</li><li><strong>Interconnected AI Agents Create a New Ecosystem</strong>: Specialized AI agents communicating with each other revolutionize industry dynamics.</li><li><strong>Secure Digital Identities Become Crucial in the BYO AI Era</strong>: The rise of portable digital identities managed by banks and e-government platforms.</li></ol><p>Let’s embrace this journey with open minds and hearts, ready to be part of a future that’s not just happening but is ours to shape. Join the conversation, share your insights, and let’s collectively envision and build the world of 2024.</p><h4>Enjoyed This Story?</h4><p>Vincent Koc is a highly accomplished, commercially-focused technologist and entrepreneur with a wealth of experience in data-driven and digital disciplines. Currently, Vincent serves as a data leader in Australia as well as a lecturer for artificial intelligence in the US.</p><p><a href="https://medium.com/subscribe/@vkoc">Subscribe for free</a> to get notified when Vincent publishes a new story.
Or follow him on <a href="https://www.linkedin.com/in/koconder/">LinkedIn</a> and <a href="https://twitter.com/koconder">X</a> (formerly Twitter).</p><p><a href="https://medium.com/subscribe/@vkoc">Get an email whenever Vincent Koc publishes.</a></p><p><em>Unless otherwise noted, all images are by the author</em></p><hr><p><a href="https://medium.com/data-science/navigating-the-ai-landscape-of-2024-trends-predictions-and-possibilities-41e0ac83d68f">Navigating the AI Landscape of 2024: Trends, Predictions, and Possibilities</a> was originally published in <a href="https://medium.com/data-science">TDS Archive</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>
</article>
<article>
<h1>Decoding the Future: The Parallels Between 1945 and Generative AI today</h1>
<p>Vincent Koc — Tue, 17 Oct 2023 06:05:47 GMT</p>
<h4>What Vannevar Bush Reveals About the Era of Generative AI</h4><figure><figcaption>Vision of Vannevar Bush’s Memex depicted from “As We May Think” (Credit: OpenAI DALLE)</figcaption></figure><h3>Part of the “Decoding the Future” Series</h3><p>After delving into the groundbreaking ideas presented at <a href="https://sxswsydney.com/">SXSW Sydney</a> in our <a href="https://medium.com/@vkoc/decoding-the-future-highlights-from-sxsw-sydneys-opening-keynote-a68c89ed7162">last installment</a>, we now turn our attention to the past — to what pioneering thinkers like Vannevar Bush can teach us about what’s next in Generative AI. Decoding the Future is a content series I launched to explore ideas and trends right here at the inaugural SXSW Sydney. Follow me on Medium to follow the conversation.</p><h3>Glimpse into Cybernetics</h3><p>Standing out among the luminaries in Monday’s sessions at SXSW was Distinguished <a href="https://researchers.anu.edu.au/researchers/bell-gx">Professor Genevieve Bell AO</a>, an anthropologist and Director of the <a href="https://cybernetics.anu.edu.au/">School of Cybernetics at ANU</a>. Her thought-provoking talk, guiding us through a journey into cybernetics, urged us to examine not just technology but the intricate “systems” that birth it. This piece aims to extend Bell’s narrative, tying her reflections to the visionary concept of the <em>memex</em> and the next frontier — Generative AI.</p><figure><figcaption>Prof. Genevieve Bell on stage at SXSW Sydney</figcaption></figure><h3>The Birth of an Idea</h3><p>In 1945, Vannevar Bush, an American engineer and inventor, published “<a href="https://www.theatlantic.com/magazine/archive/1945/07/as-we-may-think/303881/">As We May Think</a>”, outlining the <em>memex </em>— a mechanical library capable of storing, correlating, and retrieving information.
Though never realized, the memex’s principles continue to echo in today’s world of technology and artificial intelligence.</p><figure></figure><figure><figcaption>Vannevar Bush (Credit: OEM Defense US Gov — Public Domain) and “As We May Think” (Credit: Time Inc, Life — Public Domain)</figcaption></figure><p>Bush summed it up very well in these few lines, which could apply just as easily today to Generative AI technologies:</p><blockquote>Consider a future device … in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.</blockquote><p>He conceived the <em>memex</em> as a vast repository of human knowledge. Though we never saw it in material form, its concept reverberates through today’s technological landscape.</p><figure><figcaption>Machine and Human Playing Go (Credit: OpenAI DALLE)</figcaption></figure><h3>Crossroads between 1945 and Today — The Evolution from Memex to AI</h3><h4>AI as an <em>Index of Knowledge</em></h4><p>Bush aimed for a <em>memex</em> that could compile an index of human knowledge. Fast forward to today, and Generative AI serves a similar purpose but leverages algorithms, not gears and levers.</p><h4>AI Augmenting Our Cognition</h4><p>Machine augmentation serves as an extension of human intellect. Generative AI realizes this dream, aiding us in creative thinking, problem-solving, and even emotional understanding. These systems are not mere tools; they are thought enhancers.</p><h3>What’s Next for AI and Technologists?</h3><h4>AI Tailoring Rich Experiences</h4><p>Going beyond personalisation, the <em>memex</em> was idealised as a customisable, adaptive system.
The spirit of personalization is not just alive but thriving, adapting to individual needs like never before.</p><p>Imagine if the AI could not only provide a response but also act on your behalf, aware of your data and current state, and go beyond the current “human and text” interface. We will see more platform ecosystems take shape around AI, connecting personally to the world and data around us and starting to shape this into a reality.</p><h4>Cybernetics and Why Systems Thinking Matters</h4><p>We must regard technology as a “system” that goes beyond the hardware, software and AI models. Genevieve Bell’s perspective is a call to arms: are we creating systems that are holistic, adaptable, and designed to outlive us?</p><p>The future promises astonishing technological leaps, but how many are designed with a legacy in mind, like Bush’s? It’s not just about innovation; it’s about creating systems that stand the test of time.</p><h3>In Summary</h3><p>As technologists, navigating the future with AI requires some visionary thinking and foresight, but there are some practical steps we can take to navigate tomorrow:</p><ol><li><strong>Never Forget to Look Back </strong>— The problems of yesterday are the problems of today.</li><li><strong>Craft a legacy </strong>— The systems we are building today could outlive us. Think about passing something on to continue on its journey of evolution.</li><li><strong>Rethink personalisation </strong>— Consider the modalities of human-machine relationships: what would you want that relationship to be, and how should it work?</li></ol><p>The modern school of cybernetics is alive and well; it’s now up to us to shape the future use of AI. As we tread forward, let’s continue to decode the future by echoing the past.</p><p>For those interested and based in Australia, the Australian National University is running a number of <a href="https://cybernetics.anu.edu.au/education/microlearning/#introducing-cybernetic-leadership">Cybernetics micro-credentials</a> which you can use to further your journey into this field.</p>
</article>
<article>
<h1>Decoding the Future: Highlights from SXSW Sydney’s Opening Keynote</h1>
<p>Vincent Koc — Mon, 16 Oct 2023 11:45:54 GMT</p>
<h4>A Pivotal Moment: What Generative AI and Tech Shifts Mean for Tomorrow</h4><figure><figcaption>Illustration depicting an old-meets-new world of people sharing ideas in Sydney, Australia. (Credit: DALL-E and Firefly)</figcaption></figure><p>SXSW (<em>South by Southwest</em>) held its first-ever event outside of Austin, Texas since its inception in 1987, coming to Sydney, Australia, a place that celebrates the oldest continuous culture in the world. As someone who’s deeply invested in data and technology, I couldn’t pass up the opportunity to attend. The opening keynote featuring Adam Spencer and Amy Webb was nothing short of an alternative look into the future, a great way to kick off the week-long festival of ideas.</p><h3>Listen in to SXSW Sydney</h3><p>Throughout this event, I’ll be curating my thoughts, scribbles, takeaways, and memorable one-liners into a series of articles. My aim is to extend the conversation beyond the walls of the conference, in line with SXSW’s mission of fostering idea exchange and community connection. With <a href="https://schedule.sxswsydney.com/">over 400 sessions and more than 800 speakers</a>, this SXSW aims to tackle some of the most pressing issues we face today: the integration of AI in society, sustainability, generational change, wellbeing and mental health.</p><figure></figure><figure><figcaption>Amy Webb and Adam Spencer at SXSW Sydney Opening Keynote</figcaption></figure><h3>Futurism: More Than Just Trends</h3><p><a href="https://amywebb.io/">Amy Webb</a>, a quantitative futurist, author, and CEO of Future Today Institute, took the stage to go beyond mere trends. “Focusing on one trend will make you lose sight of the big picture,” she said. Webb’s approach to futurism isn’t just creative; it’s grounded in game theory and economics. She argued for a nuanced understanding of trends, advocating for strategic foresight rather than reactionary planning.
This means looking beyond macro and micro factors to the convergence of trends.</p><p>At SXSW Austin, she regularly gives a talk that captivates delegates with a glimpse into the future through her emerging tech trends report [<a href="https://www.youtube.com/watch?v=vMUpzxZB3-Y">you can watch the 2023 talk from earlier this year</a>]. Here are some of the outlooks she presented in Sydney:</p><h4>Key Outlooks</h4><ol><li><strong>The Generative AI Debate</strong>: Webb raised concerns about imprudent investment in Generative AI due to <em>FOMO</em> (Fear of Missing Out) by those working with technology. She cautioned against leaping before looking when it comes to AI adoption, a sentiment that resonates deeply with those of us in the data industry.</li><li><strong>Search Paradigms Are Shifting</strong>: The keynote touched upon the profound change in how we will interact with the internet. Soon, “<em>we won’t search the web; the web will search us</em>”. This shift has far-reaching implications, particularly when it comes to data ethics, possibly marking the end of the internet as we know it today.</li><li><strong>Two Possible Futures</strong>: The speakers presented an optimistic and a pessimistic view of a “Post Generative AI” world, challenging us to consider the potential impacts of rapid technological advancements. Will we have agency over our own data, or could we dive into a dystopian future?</li><li><strong>The Role of AI is Assistive</strong>: They likened modern AI tools to calculators or word processors — tools designed to augment human capabilities. The question remains: How reliant should we become on these technologies? How do you strike a balance of utility vs creativity?</li><li><strong>The Future of Employment</strong>: Despite potential short-term job losses, Webb remains optimistic about long-term job creation.
She highlighted the need for education systems to adapt to this ever-changing landscape.</li></ol><h3>Closing Thoughts</h3><p>As we navigate this new era of AI, data, and technology:</p><ul><li>Strategic foresight needs to be balanced with cautious, pragmatic decision-making.</li><li>Ethical and inclusive approaches are non-negotiable.</li><li>Change is the only constant, but it’s not necessarily a bad thing.</li></ul><p>SXSW Sydney will hopefully create a hotbed for ideas and solutions to challenges ahead. As we move forward, let’s not just aim to survive the future, but to shape it.</p>
</article>
</main></body></html>