Executive Summary
Avoid using callable instance attributes that call back into the object that contains them. If you must, make the object uncopyable by overriding __copy__ and __deepcopy__ to raise an error.
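As a minimal sketch of the "make it uncopyable" advice, the hypothetical class below (the name Uncopyable and the error messages are illustrative, not taken from any real code base) overrides both hooks so that copy() and deepcopy() fail loudly instead of silently producing a broken copy:

```python
import copy


class Uncopyable:
    """Hypothetical class that stores a callable bound to itself."""

    def __init__(self, name):
        self.name = name
        # Bound method stored as an instance attribute: the risky pattern.
        self.introduce = self.introduce_in_english

    def introduce_in_english(self):
        return f"I am {self.name}"

    def __copy__(self):
        # Called by copy.copy(); refuse instead of copying.
        raise TypeError(f"{type(self).__name__} objects cannot be copied")

    def __deepcopy__(self, memo):
        # Called by copy.deepcopy(); refuse instead of copying.
        raise TypeError(f"{type(self).__name__} objects cannot be deep-copied")


obj = Uncopyable("FOO")
for copier in (copy.copy, copy.deepcopy):
    try:
        copier(obj)
    except TypeError as error:
        print(f"{copier.__name__} refused: {error}")
```

TypeError is an assumption here; any exception that makes the failure obvious to the caller would do.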
Reason: in Python, foo.method is permanently bound to foo. If bar is another instance of the same class and we do bar.dynamic_method = foo.method, then bar.dynamic_method() operates on foo, not on bar. This may cause interesting bugs.
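The binding can be inspected directly through the __self__ attribute of a bound method. The following is a small illustrative sketch (the class body is invented for the demonstration):

```python
class Foo:
    def __init__(self, name):
        self.name = name

    def method(self):
        return self.name


foo = Foo("FOO")
bar = Foo("BAR")

# Accessing foo.method creates a bound method object that remembers foo.
bound = foo.method
print(bound.__self__ is foo)  # True

# Storing it on bar does not rebind it: it still points at foo.
bar.dynamic_method = foo.method
print(bar.dynamic_method.__self__ is foo)  # True
print(bar.dynamic_method())  # FOO
```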
Simple Example
Suppose we have a class that prints introductions in different languages. We also add the ability to introduce itself in a default language, as follows:
    class Foo:
        def __init__(self, name, default_language):
            self.name = name
            # Instance attribute holding a method bound to this instance
            self.introduce = (self.introduce_in_english
                              if default_language == "English"
                              else self.introduce_in_french)

        def introduce_in_english(self):
            print(f"I am {self.name}")

        def introduce_in_french(self):
            print(f"Je m'appelle {self.name}")

    foo = Foo("FOO", "English")
    foo.introduce()             # I am FOO
    foo.introduce_in_french()   # Je m'appelle FOO

    bar = Foo("BAR", "French")
    bar.introduce = foo.introduce  # (BAD IDEA)
    bar.introduce_in_english()  # I am BAR
    bar.introduce_in_french()   # Je m'appelle BAR
    bar.introduce()             # I am FOO <--- BUG!
If we use copy(), the copying happens implicitly, without the user realizing the adverse effects. Given the same definition of `Foo` as above:
    from copy import copy

    foo = Foo("FOO", "English")
    bar = copy(foo)
    bar.name = "BAR"
    bar.introduce()  # I am FOO <--- BUG!
Fun fact: if we use deepcopy() instead of copy(), the bug disappears. The explanation is rather subtle; we address it in the next post.
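The difference can at least be observed without going into the mechanics. The sketch below, which assumes CPython 3 and uses a trimmed-down Foo that returns strings instead of printing, checks which instance each copied attribute is bound to:

```python
from copy import copy, deepcopy


class Foo:
    def __init__(self, name):
        self.name = name
        self.introduce = self.introduce_in_english  # bound to self

    def introduce_in_english(self):
        return f"I am {self.name}"


foo = Foo("FOO")

shallow = copy(foo)
shallow.name = "SHALLOW"

deep = deepcopy(foo)
deep.name = "DEEP"

# The shallow copy shares the very same bound method, still tied to foo.
print(shallow.introduce.__self__ is foo)  # True
# The deep copy ends up with a method bound to the new instance.
print(deep.introduce.__self__ is deep)    # True

print(shallow.introduce())  # I am FOO
print(deep.introduce())     # I am DEEP
```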
Solution for the simple example
The solution is to avoid having callable attributes that are bound to the instance that contains them. Other callables are fine. This may lead to somewhat longer code, but it is free of the copying bug. For example, we can do:
    def introduce_in_english(name):
        print(f"I am {name}")

    def introduce_in_french(name):
        print(f"Je m'appelle {name}")

    class Foo:
        def __init__(self, name, default_language):
            self.name = name
            # Free-standing function: holds no reference to this instance
            self.greeter = (introduce_in_english
                            if default_language == "English"
                            else introduce_in_french)

        def introduce(self):
            self.greeter(self.name)

        def introduce_in_english(self):
            introduce_in_english(self.name)  # call global function

        def introduce_in_french(self):
            introduce_in_french(self.name)  # call global function
We replace dynamic methods with dynamic free-standing functions, and it avoids the bug. There are lots of other ways to accomplish the same thing, but the key point is to avoid storing a method, or another callable that is bound to the object itself, in an object attribute.
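One of those other ways, sketched here under the assumption that keeping the methods on the class is preferred, is to store the plain function looked up on the class rather than the bound method, and to pass self explicitly at call time. The attribute name _introduce_impl is hypothetical, and the methods return strings rather than printing to keep the check simple:

```python
from copy import copy


class Foo:
    def __init__(self, name, default_language):
        self.name = name
        # Plain function fetched from the class: it carries no reference
        # to any particular instance.
        self._introduce_impl = (Foo.introduce_in_english
                                if default_language == "English"
                                else Foo.introduce_in_french)

    def introduce(self):
        # Pass self explicitly at call time.
        return self._introduce_impl(self)

    def introduce_in_english(self):
        return f"I am {self.name}"

    def introduce_in_french(self):
        return f"Je m'appelle {self.name}"


foo = Foo("FOO", "English")
bar = copy(foo)
bar.name = "BAR"
print(foo.introduce())  # I am FOO
print(bar.introduce())  # I am BAR
```

Because the stored function is not bound to anything, a shallow copy carries it over safely.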
Real Life Example: Decorators
The actual bug we encountered in production was about applying decorators to a method. I can’t copy real production code for intellectual property protection reasons, but conceptually it went something like this:
    from copy import deepcopy

    class RangeValidator:
        def __init__(self, max_value, decorator=None):
            self.max_value = max_value
            if decorator:
                # Instance attribute that shadows the validate() method
                self.validate = decorator(self.validate)

        def set_max_value(self, value):
            self.max_value = value
            return self

        def validate(self, number):
            return number < self.max_value

    def negate(validator):
        def inner(number):
            return not validator(number)
        return inner

    def double(validator):
        return deepcopy(validator).set_max_value(2*validator.max_value)

    at_least_100 = RangeValidator(100, negate)  # Rejects anything that is under 100
    at_least_200 = double(at_least_100)  # Supposed to reject anything under 200, but doesn't

    print(at_least_100.validate(150))  # True
    print(at_least_200.validate(150))  # Also True <-- BUG
    print(at_least_200.max_value)      # 200
The source of the bug is similar to the simple case, with two differences:
- We create an instance attribute validate that hides the original method.
- The attribute is bound to the containing object indirectly. It references the function inner(), which captures the original validate() method bound to at_least_100.
It all works well until we copy this attribute to another instance, e.g. at_least_200.
Note that in this case we use deepcopy(), but it does not help: the bug is still there. See the next post to learn why.
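The indirect binding can be made visible by looking inside the closure. The following sketch assumes CPython 3 and uses a shortened version of the class above; the variable names original, clone and captured are illustrative:

```python
from copy import deepcopy


class RangeValidator:
    def __init__(self, max_value, decorator=None):
        self.max_value = max_value
        if decorator:
            self.validate = decorator(self.validate)

    def validate(self, number):
        return number < self.max_value


def negate(validator):
    def inner(number):
        return not validator(number)
    return inner


original = RangeValidator(100, negate)
clone = deepcopy(original)
clone.max_value = 200

# Both instances share one and the same `inner` function object.
print(clone.validate is original.validate)  # True

# The closure cell of `inner` holds the bound method captured in __init__.
captured = clone.validate.__closure__[0].cell_contents
print(captured.__self__ is original)  # True

# So the clone still validates against the original's max_value of 100.
print(clone.validate(150))  # True
```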
Solution 1: Use Free Function for Validation, Reapply the Decorator on Changes
Again, multiple solutions are possible, but the main point is to avoid callables stored in attributes that point to the containing instance, directly or indirectly.
We can store the decorator and recreate the validator whenever the maximum value changes. Usage of the class remains the same:
    from functools import partial

    def validate_range(number, max_value):
        return number < max_value

    class RangeValidator:
        def __init__(self, max_value, decorator=None):
            self.decorator = decorator
            self.set_max_value(max_value)

        def set_max_value(self, value):
            self.max_value = value
            # one-argument function that does not reference self
            self.validator = partial(validate_range, max_value=value)
            if self.decorator:
                # apply decorator to the function
                self.validator = self.decorator(self.validator)
            return self

        def validate(self, number):
            return self.validator(number)
Solution 2: Decorator Class
We can go full object-oriented and build a real decorator pattern. The drawback of this solution is that the decorator must expose the properties of the inner validator to make things work.
    from copy import deepcopy

    class RangeValidator:
        def __init__(self, max_value):
            self.max_value = max_value

        def validate(self, number):
            return number < self.max_value

    class Negate:
        def __init__(self, validator):
            self.validator = validator

        def validate(self, number):
            return not self.validator.validate(number)

        # Expose max_value of the wrapped validator
        @property
        def max_value(self):
            return self.validator.max_value

        @max_value.setter
        def max_value(self, value):
            self.validator.max_value = value

    def double(validator):
        copy = deepcopy(validator)
        copy.max_value = 2*validator.max_value
        return copy

    at_least_100 = Negate(RangeValidator(100))
    at_least_200 = double(at_least_100)

    print(at_least_100.validate(150))  # True
    print(at_least_200.validate(150))  # False <-- THE BUG IS GONE
    print(at_least_200.max_value)      # 200
Solution 3: Functions all the way down
Keep in mind that the problem arises when we copy an object that has self-referencing callables as attributes. Functions in Python can have attributes, and they are effectively never copied: one can call copy() or deepcopy() on a function, but it just returns the original object. So if we use functions instead of classes, it will work great. We still have self-referencing attributes, but copying is no longer a problem.
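A quick check of that claim, as an illustrative sketch (the function greet and its attribute are invented for the demonstration):

```python
from copy import copy, deepcopy


def greet():
    return "hello"


greet.language = "English"  # functions accept arbitrary attributes

# Both copy() and deepcopy() hand back the very same function object.
print(copy(greet) is greet)      # True
print(deepcopy(greet) is greet)  # True
```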
The resulting code, however, looks like a JavaScript wannabe and may be difficult to read for traditional Python programmers.
    def RangeValidator(max_value, decorator=None):
        def this():
            this.max_value = max_value
            validate = lambda number: number < this.max_value  # don't use just max_value here!
            this.validate = decorator(validate) if decorator else validate
            this.copy = lambda: RangeValidator(this.max_value, decorator)
        this()  # execute "constructor"
        return this

    def negate(validator):
        return lambda number: not validator(number)

    def double(validator):
        copy = validator.copy()
        copy.max_value = 2*validator.max_value
        return copy

    at_least_100 = RangeValidator(100, negate)
    at_least_200 = double(at_least_100)

    print(at_least_100.validate(50))   # False
    print(at_least_100.validate(150))  # True
    print(at_least_100.validate(250))  # True
    print(at_least_200.validate(50))   # False
    print(at_least_200.validate(150))  # False
    print(at_least_200.validate(250))  # True
Conclusion
Attributes that implicitly reference the containing object don’t work well when the object is mutable and copying is allowed. This may lead to very subtle bugs and is better avoided, even if the result is more verbose code.
See the next post for some exciting details on how dynamic methods and (deep)copying work under the hood, and fascinating interactions between them.
Shameless plug: https://aklepatc.livejournal.com/1396.html
Well, now we know that in the presence of copy this technique is dangerous.
If you were to copy the “initialized” instance in your example, it would continue to call foo() on the original.
Makes perfect sense. Ty!
If the game is all about the same instance then no copy is involved, right?