Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Suggestion / Idea  ObiUpdater needlessly allocates every frame
#1
I've been having performance issues with Obi Cloth, and have been spending lots of time in the profiler to get a handle on how to improve things.

I've noticed that Obi allocates quite a lot of memory each frame - in my scene with 6 clothed characters, `ObiLateFixedUpdater` is allocating ~2.5 kb each frame. At least some of this is very easily eliminated.

`ObiUpdater.BeginStep` and `ObiUpdater.Substep` both allocate a new temporary List each time they're called. This list can be moved into a member variable to avoid the repeated allocation:

Code:
private List<IObiJobHandle> m_JobHandles = new List<IObiJobHandle>();
protected void Substep(float stepDeltaTime, float substepDeltaTime, int index)
{
    using (m_SubstepPerfMarker.Auto())
    {
        // Kick off all solver jobs:
        foreach (ObiSolver solver in solvers)
            if (solver != null)
                this.m_JobHandles.Add(solver.Substep(stepDeltaTime, substepDeltaTime, index));

        // wait for all solver jobs to complete:
        foreach (IObiJobHandle handle in this.m_JobHandles)
            if (handle != null)
                handle.Complete();

        this.m_JobHandles.Clear();
    }
}

`ObiSolver.Update` also does some unnecessary allocation, because it calls `Mathf.Max(params float[])`, which allocates a temporary params array. (It also accesses `transform.lossyScale` - which I think involves a roundtrip out of managed code and into the engine - three times). I've made this minor tweak:

Code:
void Update()
{
    var lossyScale = transform.lossyScale;
    m_MaxScale = Mathf.Max(Mathf.Max(lossyScale.x, lossyScale.y), lossyScale.z);
}

These changes bring my allocations down to ~1.5kb per frame, which is a good start - but far from perfect!

I really like Obi, and I'm a bit surprised to see so much `GC.alloc` action every frame. I'd love to be able to do cloth simulation without any allocations... might there be a performance pass done in the future?

Cheers!
Reply
#2
(19-08-2021, 12:07 AM)timconkling Wrote: I've been having performance issues with Obi Cloth, and have been spending lots of time in the profiler to get a handle on how to improve things.

I've noticed that Obi allocates quite a lot of memory each frame - in my scene with 6 clothed characters, `ObiLateFixedUpdater` is allocating ~2.5 kb each frame. At least some of this is very easily eliminated.

`ObiUpdater.BeginStep` and `ObiUpdater.Substep` both allocate a new temporary List each time they're called. This list can be moved into a member variable to avoid the repeated allocation:

Code:
private List<IObiJobHandle> m_JobHandles = new List<IObiJobHandle>();
protected void Substep(float stepDeltaTime, float substepDeltaTime, int index)
{
    using (m_SubstepPerfMarker.Auto())
    {
        // Kick off all solver jobs:
        foreach (ObiSolver solver in solvers)
            if (solver != null)
                this.m_JobHandles.Add(solver.Substep(stepDeltaTime, substepDeltaTime, index));

        // wait for all solver jobs to complete:
        foreach (IObiJobHandle handle in this.m_JobHandles)
            if (handle != null)
                handle.Complete();

        this.m_JobHandles.Clear();
    }
}

`ObiSolver.Update` also does some unnecessary allocation, because it calls `Mathf.Max(params float[])`, which allocates a temporary params array. (It also accesses `transform.lossyScale` - which I think involves a roundtrip out of managed code and into the engine - three times). I've made this minor tweak:

Code:
void Update()
{
    var lossyScale = transform.lossyScale;
    m_MaxScale = Mathf.Max(Mathf.Max(lossyScale.x, lossyScale.y), lossyScale.z);
}

These changes bring my allocations down to ~1.5kb per frame, which is a good start - but far from perfect!

I really like Obi, and I'm a bit surprised to see so much `GC.alloc` action every frame. I'd love to be able to do cloth simulation without any allocations... might there be a performance pass done in the future?

Cheers!

Hi there!

Thanks a lot for the input! One of these (the updater list<> as a class member had been already addressed in the development branch, but I wasn't aware of the other one (Mathf.Max allocating a temporary array for any more than 2 arguments Indeciso ). Both improvements will be included.

Regularly I profile Obi and grab low-hanging fruit, however not all have been picked yet. Expect less garbage generation in upcoming releases.
Reply
#3
(19-08-2021, 01:00 PM)josemendez Wrote: Hi there!

Thanks a lot for the input! One of these (the updater list<> as a class member had been already addressed in the development branch, but I wasn't aware of the other one (Mathf.Max allocating a temporary array for any more than 2 arguments Indeciso ). Both improvements will be included.

Regularly I profile Obi and grab low-hanging fruit, however not all have been picked yet. Expect less garbage generation in upcoming releases.

That's great to hear, thanks Jose!
Reply
#4
The allocations required for the job handles returned by BeginStep and Substep are kinda "annoying". They return a new class instance every time (BurstJobHandle in the case of the Burst backend), and I see this is conceptually necessary because there are two backends and the shared interface IObiJobHandle is used. However, both those classes are so simple that I don't think this deserves an allocation each and every frame.

For my project, I added a pool for the burst job handles now, which I release before the Clear that @timconkling added. This gets rid of regular allocations and does not need much refactoring (I added a Release method to the IObiJobHandle interface). Something like this should definitely be adopted in Obi.

With this change and the others I proposed (http://obi.virtualmethodstudio.com/forum...-3251.html and http://obi.virtualmethodstudio.com/forum...-3250.html), I seem to be down to zero per-frame GC allocations for Obi.
Reply
#5
(28-12-2021, 11:05 AM)pdinklag Wrote: The allocations required for the job handles returned by BeginStep and Substep are kinda "annoying". They return a new class instance every time (BurstJobHandle in the case of the Burst backend), and I see this is conceptually necessary because there are two backends and the shared interface IObiJobHandle is used. However, both those classes are so simple that I don't think this deserves an allocation each and every frame.

For my project, I added a pool for the burst job handles now, which I release before the Clear that @timconkling added. This gets rid of regular allocations and does not need much refactoring (I added a Release method to the IObiJobHandle interface). Something like this should definitely be adopted in Obi.

With this change and the others I proposed (http://obi.virtualmethodstudio.com/forum...-3251.html and http://obi.virtualmethodstudio.com/forum...-3250.html), I seem to be down to zero per-frame GC allocations for Obi.

Hi pdinklag,

Thanks a lot for your reports/contributions! This one and the allocation in Decimate() will certainly make their way onto Obi's codebase.

Keep an eye for them in Obi 6.3.1.
Reply