Do I need to do anything special for CASCADE deletions in Elixir?
Nothing special is required. The preferred and most efficient approach is to exploit the database's own CASCADE capability by specifying the appropriate "ON <ACTION> CASCADE" clause when the table is CREATEd. This is how to specify the appropriate clauses in Elixir relation statements (see discussion below for details of "passive_deletes" option):
class MyThing(Entity): [ ... ] children = elixir.ManyToMany('SomeChildThing', [...], onupdate = 'CASCADE', ondelete = 'CASCADE' ) using_options(passive_deletes=True)
When the underlying database does not provide CASCADE behaviour (SQLite <=3.0 and MySQL MyISAM tables), then SA will provide that behaviour when instructed via the "cascade" keyword in a relation (example provided below).
The issue is aired in a sqlelixir post:
(http://groups.google.com/group/sqlelixir/browse_thread/thread/a13bb362b4311d76)
Harish's Question
I have a question regarding using the ORM and inefficient deleting.
Let's say I have a model where A and B have a relationship such that there may be many, many, many B for each A. Like thousands or tens of thousands. Let's also say that I have an instance of 'a' and the 'b' attribute is lazy-loaded, and furthermore is marked with cascade="all,delete-orphan". If I delete 'a', why does the ORM have to load every single 'B' to delete them?
sess.delete(a) sess.flush() # results in loading every 'B' instance and then deleting them by id.
I see SQL like this:
SELECT b.id, b.attr1, b.attr2, .... where b.a_id = %s .... for a.id
followed by
DELETE from B_TABLE where b.id in ( ..... )
Instead, why don't I see:
DELETE from B_TABLE where a_id=%s .... for a.id
the latter being much more efficient, and SA already knows about the relationship.
Mike Bayer's response
The reason this behavior is not implicit because SQLA is expected by default to cascade a delete operation to all child items without reliance upon the database's own CASCADE behavior. If each B had relations to a bunch of C's for example, SQLA would have to load every B and either delete each C or mark their relation to B as NULL - the process continues for as much depth is required. The ORM doesn't try to implement the complexity of guessing whether or not B's are safe to be deleted en-masse or not. Instead, it provides an explicit option which allows transparently integrating with ON DELETE CASCADE, which is the most efficient way possible.
Set ON DELETE CASCADE within your schema in conjunction with the "passive_deletes" flag on relation(), which will tell SQLA to not load any B's that are not already present in the session - the database will take care of deleting them. If each B is already locally present, then SQLA will still issue an individual DELETE for each B.
Using Passive Deletes
The SA 0.4 documentation on passive deletes provides a few more details ...
Use passive_deletes=True to disable child object loading on a DELETE operation, in conjunction with "ON DELETE (CASCADE|SET NULL)" on your database to automatically cascade deletes to child objects. Note that "ON DELETE" is not supported on SQLite, and requires InnoDB tables when using MySQL:
mytable = Table('mytable', meta, Column('id', Integer, primary_key=True), ) myothertable = Table('myothertable', meta, Column('id', Integer, primary_key=True), Column('parent_id', Integer), ForeignKeyConstraint(['parent_id'],['mytable.id'], ondelete="CASCADE"), ) mmapper(MyOtherClass, myothertable) mapper(MyClass, mytable, properties={ 'children':relation(MyOtherClass, cascade="all, delete-orphan", passive_deletes=True) })
When passive_deletes is applied, the children relation will not be loaded into memory when an instance of MyClass is marked for deletion. The cascade="all, delete-orphan" will take effect for instances of MyOtherClass which are currently present in the session; however for instances of MyOtherClass which are not loaded, SQLAlchemy assumes that "ON DELETE CASCADE" rules will ensure that those rows are deleted by the database and that no foreign key violation will occur.
Elsewhere, other nuances are exposed ...
Pylons wiki: "Using SQLAlchemy with Pylons"
(http://wiki.pylonshq.com/display/pylonsdocs/Using+SQLAlchemy+with+Pylons)
For any relationship, you can add cascade = "all, delete-orphan" as an extra argument to relation() in your mappers to ensure that when a join is deleted the joined object is deleted as well ... Beware though that despite its name, delete-orphan removes joined objects even if another object is joined to it.
Pythonisto blog: Cascade Rules in SQLAlchemy
(http://blog.pythonisito.com/2008/07/cascade-rules-in-sqlalchemy.html)
What's "cascading" in the mapper is session-based operations. This includes putting an object into the session (saving it), deleting an object from the session, etc. Generally, you don't care about all that stuff, because it Just Works most of the time, as long as you specify cascade="all" on your relation() properties in your mappers. What this means is "whatever session operation you do to the mapped class, do it to the related class as well".
One little confusing thing is that there's another thing you'll often want to specify in your cascade rules, and that's the "delete-orphan". In fact, most of my 1:N relation()s look like:
mapper(ParentClass, parent, properties=dict( children=relation(ChildClass, backref='parent', cascade='all,delete-orphan') ) )
The "delete-orphan" specifies that if you ever have a ChildClass instance that is "orphaned", that is, not connected to some ParentClass, go ahead and delete that ChildClass. You want to specify this whenever you don't want ChildClass instances hanging out with null ParentClass references. Note that even if you don't specify "delete-orphan", deletes on the ParentClass instance will still cascade to related ChildClass instances.
Executable Example:
from elixir import * metadata.bind = 'sqlite:///:memory:' # metadata.bind = 'sqlite:///movies.sqlite' # metadata.bind = 'mysql://localhost/cascade_test' metadata.bind.echo = True class Movie(Entity): title = Field(Unicode(30), primary_key=True) year = Field(Integer, primary_key=True) description = Field(UnicodeText) releasedate = Field(DateTime) director = ManyToOne('Director') genres = ManyToMany('Genre', cascade="all,delete-orphan") actors = ManyToMany('Actor', tablename='movie_casting', cascade="all,delete-orphan") def __repr__(self): return '<Movie "%s" (%d)>' % (self.title, self.year) class Person(Entity): using_options(inheritance='multi') name = Field(Unicode(60)) def __repr__(self): return '<Person "%s">' % self.name class Actor(Person): using_options(inheritance='multi') movies = ManyToMany('Movie', tablename='movie_casting') def __repr__(self): return '<Actor "%s">' % self.name class Director(Person): using_options(inheritance='multi') movies = OneToMany('Movie', cascade="all,delete-orphan") def __repr__(self): return '<Director "%s">' % self.name class Genre(Entity): name = Field(Unicode(15), primary_key=True) movies = ManyToMany('Movie') def __repr__(self): return '<Genre "%s">' % self.name setup_all() create_all() rscott = Director(name=u"Ridley Scott") glucas = Director(name=u"George Lucas") alien = Movie(title=u"Alien", year=1979) swars = Movie(title=u"Star Wars", year=1977) brunner = Movie(title=u"Blade Runner", year=1982) rscott.movies.append(brunner) rscott.movies.append(alien) swars.director = glucas print rscott.movies assert session.query(Movie).filter_by(director=rscott).count() == 2 session.delete(rscott) assert session.query(Movie).filter_by(director=rscott).count() == 0 session.commit()
