Saturday, April 27, 2013

Using ARC with non iOS compiler - possible?

Did you ever have the situation with 2 lists sharing the same objects and possibly one or both had the OwnsObjects property set to true? Especially when using interfaced lists like in Spring4d or DSharp you want to make use of reference couting and automatic memory management. But what if some instances are shared between different lists or moved from one list to the other. You cannot use the Delete or Remove method and then add the object to the other list of vice versa if the first list has OwnsObjects true because it will destroy the instance when it gets removed. That is why there is the Extract method which removes the instance but does not destroy it. This can get complicated and possibly leading to memleaks or exceptions quickly.

Using interfaced objects for simple data storages might not be a very good idea. It requires you to write interfaces with the properties just for that purpose and if you use some kind of RTTI based binding you cannot do that as there is no RTTI for interface properties (which are just syntax sugar anyway).

So what would we give for some easy memory management in these cases. How about using the concept we know from interfaces and other types for objects?

Of course we could write some TRefCountObject and inherit our data classes from that base class and handle these in our Notify method inside the list when items get added or removed. But that would be to easy and not magic at all. ;) And more seriously it does not always work to change the base type due to several reasons.

So what do we need? Basically just a new field inside our object to keep track of the reference count. Keep in mind we cannot do a full ARC implementation because that would include assignments and parameter passing which would need compiler support. We just want it for when objects are put into lists.

The method that is responsible for allocating the memory of new instances is TObject.NewInstance. So we need to replace that:

procedure InitializeARC;
var
  Buffer: array[0..4] of Byte;
begin
  Buffer[0] := $E9
  // redirect TObject.NewInstance
  PInteger(@Buffer[1])^ := PByte(@NewInstance) - (PByte(@TObject.NewInstance) + 5);
  WriteMemory(@TObject.NewInstance, @Buffer, 5);
end;

What this code does is place a jump instruction at the very beginning of the TObject.NewInstance method that redirects it to our NewInstance routine which looks like this:

function NewInstance(Self: TClass): TObject;
begin
  // get additional memory for the RefCount field
  GetMem(Pointer(Result), Self.InstanceSize + SizeOf(Integer));
  Result := InitInstance(Self, Result);
end;

It does basically the same as the original except that it allocates 4 bytes more for our RefCount field and then calls our version of InitInstance (which is responsable for initializing the object):

function InitInstance(Self: TClass; Instance: Pointer): TObject;
const
  Buffer: Pointer = @BeforeDestruction;
begin
  Result := Self.InitInstance(Instance);

  // initialize the RefCount field
  GetRefCountFieldAddress(Instance)^ := 0;

  // replace TObject.BeforeDestruction
  if PPointer(NativeInt(Self) + vmtBeforeDestruction)^ = @TObject.BeforeDestruction then
    WriteMemory(PPointer(NativeInt(Self) + vmtBeforeDestruction), @Buffer, SizeOf(Pointer));
end;

Since TObject.InitInstance just zeroes the memory the RTL knows about (obtained by calling InstanceSize) we need to set our field which sits on the last 4 bytes in our instance:

function GetRefCountFieldAddress(Instance: TObject): PInteger; inline;
begin
  // the RefCount field was added last
  Result := PInteger(NativeInt(Instance) + Instance.InstanceSize);
end;

Along with the reference couting we want to make sure that the instance is not getting destroyed when it is still managed by the RefCount (because it sits in some list). That is why the BeforeDestruction method gets replaced. Why not detour like NewInstance? The implementation in TObject is empty so there are not 5 bytes of available that we can overwrite to jump to our implementation. But as it is virtual we can replace it in the classes VMT. Like its implementation in TInterfacedObject it will raise an error when the RefCount is not 0.

procedure BeforeDestruction(Self: TObject);
begin
  if GetRefCount(Self) <> 0 then
    System.Error(reInvalidPtr);
end;

Implementing the actual AddRef and Release routines is pretty easy aswell:

function __ObjAddRef(Instance: TObject): Integer;
begin
  Result := InterlockedIncrement(GetRefCountFieldAddress(Instance)^);
end;

function __ObjRelease(Instance: TObject): Integer;
begin
  Result := InterlockedDecrement(GetRefCountFieldAddress(Instance)^);
  if Result = 0 then
    Instance.Destroy;
end;

The most important thing: You need to add the unit which contains this as the very first unit in your project  (or after ShareMem) so the NewInstance method gets patched as soon as possible.

Time to test if it does what it should:

implementation

{$R *.dfm}

uses
  DSharp.Collections,
  DSharp.Core.ARC;

type
  TList<T: class> = class(DSharp.Collections.TList<T>)
  protected
    procedure Notify(const Value: T; const Action: TCollectionChangedAction); override;
  end;

procedure TList<T>.Notify(const Value: T;
  const Action: TCollectionChangedAction);
begin
  case Action of
    caAdd: __ObjAddRef(Value);
    caRemove: __ObjRelease(Value);
  end;
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  list1, list2: IList<TObject>;
begin
  list1 := TList<TObject>.Create;
  list2 := TList<TObject>.Create;

  list1.Add(TObject.Create);
  list1.Add(TObject.Create);
  list2.AddRange(list1);
  list1.Delete(1);
end;

initialization
  ReportMemoryLeaksOnShutdown := True;

end.

When we click the button both objects get added to both lists and the last list containing an object will cause it to get destroyed when removed (which happens if the list gets destroyed aswell).

So far this is more of a proof of concept but I think this can make some code easier and less complicated especially when working a lot with lists and moving around objects without knowing what list at the end owns the objects.

You can find that code in the svn repository and as always your feedback is welcome.