
Unveiling the Complexity of Guiding AI Systems

Artificial Intelligence is a potent and dynamic technology with vast potential for evolution and self-learning. Controlling the learning process of an AI system, however, is a complex feat that requires imposing boundaries and constraints. Even when we restrict a system’s access to information, its interaction parameters, and its core functions, we cannot guarantee that it will adhere to those boundaries. The problem is compounded by the inherently subjective nature of the limitations themselves: often we do not even know what we don’t want the system to learn.

In AI, decision-making is fast, and it is rooted in the principle of objectivity. The objective function, a cornerstone of AI systems, encodes the property we want the system to possess, typically the minimization of prediction error. While the objective is fixed in advance, the pathway to achieving it is left unconstrained, much like a game in which the goal is clear but the means of reaching it are flexible within certain bounds.
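The idea can be sketched in a few lines of code. This is a minimal, hypothetical illustration (the data and learning rate are invented): the objective function measures prediction error, and training simply minimizes it. The objective specifies what to achieve, not how.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                     # input features
true_w = np.array([2.0, -1.0, 0.5])               # weights we hope to recover
y = X @ true_w + rng.normal(scale=0.1, size=100)  # noisy targets

def objective(w):
    """Mean squared prediction error: the property we want minimized."""
    return np.mean((X @ w - y) ** 2)

# Gradient descent: any path that drives the objective down is acceptable.
w = np.zeros(3)
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of the MSE
    w -= 0.1 * grad

print(np.round(w, 2))  # ends close to true_w; the route taken was unconstrained
```

Nothing in `objective` constrains the trajectory the optimizer follows; only the end state is scored, which is precisely the gap the game below makes vivid.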

Consider a simple game: a bucket stands 2 meters away from each player, and the objective is to throw a ball into it as many times as possible within a fixed time frame. Each successful throw earns the player a new ball; a missed ball can only be retrieved from outside a 2-meter radius around the bucket. In pursuit of the objective, players devise imaginative strategies that produce unintended side effects. Even with a clear objective, individuals may take actions that appear “negative” yet, from their perspective, serve the objective. Taken together, these individual actions can drive the game toward outcomes far from the original intent, altering its intended behavior and overall objective.
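The dynamic above is often called specification gaming, and it can be sketched with a toy search over strategies. The strategy names and success rates here are invented for illustration: the scoring function counts only successful throws per minute, so nothing rules out a move the designer never intended.

```python
# Hypothetical strategy space for the bucket game. The last entry is a
# "creative" move the rules never anticipated.
strategies = {
    "careful aim":        {"throws_per_min": 10, "success_rate": 0.40},
    "rapid fire":         {"throws_per_min": 30, "success_rate": 0.15},
    "tilt bucket toward": {"throws_per_min": 12, "success_rate": 0.90},
}

def score(s):
    """The stated objective: successful throws per minute -- and nothing else."""
    return s["throws_per_min"] * s["success_rate"]

best = max(strategies, key=lambda name: score(strategies[name]))
print(best)  # "tilt bucket toward": the objective alone cannot exclude it
```

An optimizer given only this score will reliably surface the unintended strategy, because from the objective's point of view it is simply the best one.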

Just as aligning the game’s objective with its intended behavior requires additional constraints that limit each player’s flexibility, the fundamentally incomplete objective function of an AI system leaves it free to alter its core behavior. The perpetual effort to define new rules therefore reflects a quest for greater control and safety in these systems.
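One common way to encode such an additional rule is a penalty term attached to the objective. A minimal sketch, with invented strategy values and a `breaks_rule` flag that stands in for whatever constraint the designers add:

```python
# Hypothetical strategies, scored as (successful throws per minute).
strategies = {
    "careful aim":        {"rate": 10 * 0.40, "breaks_rule": False},
    "tilt bucket toward": {"rate": 12 * 0.90, "breaks_rule": True},
}

PENALTY = 100.0  # chosen large enough to dominate any achievable score

def constrained_score(s):
    """Objective plus a penalty for violating the added rule."""
    return s["rate"] - (PENALTY if s["breaks_rule"] else 0.0)

best = max(strategies, key=lambda name: constrained_score(strategies[name]))
print(best)  # "careful aim": the penalty restores the intended behavior
```

The catch, as the paragraph above notes, is that the rule set is never complete: each penalty closes one loophole while the space of unanticipated strategies remains open.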

In conclusion, the design of AI systems demands a paramount focus on safety and control. Examining the complexities of building safe and controllable AI systems reveals the intricate yet pivotal considerations that underpin this transformative technology. For a more comprehensive treatment of the topic, see my book “Towards Sustainable Artificial Intelligence”, or reach out for a chat.