Posts: 50
Threads: 14
Joined: Nov 2021
Reputation:
0
17-01-2022, 10:42 AM
(This post was last modified: 17-01-2022, 10:56 AM by lufydad.)
I have a scene with multiple Obi solvers, and it makes my scene run at 2 fps, with just 3 ropes per solver.
When I move the ropes into a single Obi solver and disable all the other solvers, I get acceptable performance (30 fps); see https://imgur.com/a/eCAepal.
The documentation says: "They can be added to any GameObject in your scene and there can be more than one solver working simultaneously in the same scene." But does this mean that each added solver will have a large impact on performance?
Posts: 6,323
Threads: 24
Joined: Jun 2017
Reputation:
400
Obi Owner:
17-01-2022, 11:01 AM
(This post was last modified: 17-01-2022, 11:06 AM by josemendez.)
Hi there!
The performance difference of one solver vs multiple solvers is negligible. All actors in all solvers are updated in parallel as long as they're all updated by the same ObiUpdater component, so it doesn't matter how many solvers you have.
However, having a separate updater for each solver negates any parallelism between solvers: they will be updated sequentially (one after another), resulting in much worse performance. See:
http://obi.virtualmethodstudio.com/manua...aters.html
Quote:Typically, you'd want to have only one updater in your scene. This allows tasks dispatched by all solvers in the updater to be executed in parallel.
If the updater and the solver are on the same GameObject, then when you copied the solver you also copied the updater, and this is probably the root of the issue. Just make sure all solvers are updated by the same updater.
This alone does not explain the *huge* performance hit you see, though. The magnitude of the performance drop is due to death spiraling. This happens in any game engine when a physics step takes longer to compute than the amount of "physics time" it simulates.
If you open up the profiler, you will see more than 1 call to FixedUpdate() per frame. To fix this, go to Project Settings->Time and set Maximum Allowed Timestep to a small multiple of your fixed timestep. For instance, if your timestep is 0.02 (the default), set the maximum to 0.04.
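To make the death spiral concrete, here is a minimal Python sketch of a fixed-timestep loop. It is a toy model, not Unity's or Obi's actual code: the function name, the 30 ms step cost and the 5 ms render cost are all made-up assumptions, and times are kept in whole milliseconds to avoid rounding issues. It only shows how capping the physics time consumed per frame bounds the number of steps.

```python
def fixed_steps_per_frame(step_cost_ms, max_dt_ms, frames=10,
                          fixed_dt_ms=20, render_cost_ms=5):
    """Toy fixed-timestep loop; returns the physics step count of each frame."""
    counts = []
    accumulator_ms = 0
    frame_time_ms = fixed_dt_ms  # assume the first frame lasted one timestep
    for _ in range(frames):
        # Physics must catch up with the time the previous frame took,
        # but never with more than the allowed maximum.
        accumulator_ms += min(frame_time_ms, max_dt_ms)
        steps, accumulator_ms = divmod(accumulator_ms, fixed_dt_ms)
        counts.append(steps)
        # The more steps a frame runs, the longer that frame takes.
        frame_time_ms = render_cost_ms + steps * step_cost_ms
    return counts


# Each 20 ms step costs 30 ms to compute (hypothetical numbers).
print(fixed_steps_per_frame(30, max_dt_ms=333))  # [1, 1, 2, 3, 5, 8, 12, 17, 16, 17]
print(fixed_steps_per_frame(30, max_dt_ms=40))   # [1, 1, 2, 2, 2, 2, 2, 2, 2, 2]
```

In this model, a 333 ms cap lets the loop settle at 16-17 steps per frame (about 515 ms per frame, roughly 2 fps), while a 40 ms cap holds it at 2 steps per frame (65 ms). The trade-off is that, with the cap in place, simulated time falls behind real time whenever steps are this expensive.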
Thanks! It works a lot better with only one ObiFixedUpdater.
As for the "death spiral", I'm aware of it, and I currently have an algorithm that dynamically chooses a fixedDeltaTime between 1/30 and 1/120 to stay consistent with the simulation time.
However, I still have 2 performance problems:
- When my 4 GameObjects with ropes are instantiated, I get a performance hit due to the Obi fixed update (see capture https://imgur.com/a/0V04YNb). Do you have an idea of where it comes from, and how I could reduce it?
- Overall, with 12 ropes of a few meters each (approximately 1500 particles in total) and 2 iterations for bending and distance constraints, performance is "only" around 30 fps. Is that normal? (Unity 2021.2.4, CPU: AMD Ryzen 7 4800H, GPU: Nvidia RTX 2060, RAM: 8 GB)
(17-01-2022, 03:21 PM)lufydad Wrote: When my 4 GameObjects with ropes are instantiated, I get a performance hit due to the Obi fixed update (see capture https://imgur.com/a/0V04YNb). Do you have an idea of where it comes from, and how I could reduce it?
Never seen something similar, tbh. The cost of a FixedUpdate call is independent of where in the frame it happens. Could you dig deeper into the call stack and see where this is coming from?
(17-01-2022, 03:21 PM)lufydad Wrote: Overall, with 12 ropes of a few meters each (approximately 1500 particles in total) and 2 iterations for bending and distance constraints, performance is "only" around 30 fps. Is that normal? (Unity 2021.2.4, CPU: AMD Ryzen 7 4800H, GPU: Nvidia RTX 2060, RAM: 8 GB)
Profile to see whether the culprit is simulation (very unlikely) or mesh generation/rendering, which happens during solver.Interpolate(). If that's the issue, using a simpler rendering method and/or aggressive mesh decimation will help. Decimation can be enabled in the ObiPathSmoother component.
I see another strange thing when the GameObjects are instantiated: https://imgur.com/a/kScsSBR
(17-01-2022, 08:14 PM)josemendez Wrote: Never seen something similar, tbh. The cost of a FixedUpdate call is independent of where in the frame it happens. Could you dig deeper into the call stack and see where this is coming from?
Profile to see whether the culprit is simulation (very unlikely) or mesh generation/rendering, which happens during solver.Interpolate(). If that's the issue, using a simpler rendering method and/or aggressive mesh decimation will help. Decimation can be enabled in the ObiPathSmoother component.
In both cases it seems to come from ApplyConstraint (see https://imgur.com/a/8XIcdFG).
The interpolation costs only 3 ms, while the fixed update costs 13 ms.
20-01-2022, 11:00 AM
(This post was last modified: 20-01-2022, 11:05 AM by josemendez.)
Noticed a suspicious thing in that profiler pic:
60 (!) calls to FixedUpdate. Since you're using deep profiling, I guess this is in part due to the death spiral (0.33/0.02 ≈ 17 calls per frame at worst, times 3 player updates = 51 calls; it does not quite add up, but it's close).
However, I took a closer look at your original profiler pic, and during the spike there are 42 calls to FixedUpdate, so this is definitely not caused by deep profiling. It seems like your simulation starts and immediately enters a downward spiral, then recovers from it shortly after.
Things that might cause this:
- Maybe you're using async compilation in Burst? This will lazily compile your jobs on the fly the first time they're executed, during gameplay. The result is that the first few frames of anything that uses Burst are a lot slower since A) it's not actually using Burst yet B) code is being compiled in the background as the game runs. This only happens in-editor (standalone is precompiled at build time) and can be optionally disabled so all Burst code is compiled when you press play. From the manual:
http://obi.virtualmethodstudio.com/manua...kends.html
Quote:keep in mind that Burst uses asynchronous compilation in the editor by default. This means that the first few frames of simulation will be noticeably slower, as Burst is still compiling jobs while the scene runs. You can enable synchronous compilation in the Jobs->Burst menu, this will force Burst to compile all jobs before entering play mode.
- Instantiating objects in the solver does have some overhead, as particle data must be converted from actor to solver space and constraint batches from different actors merged. Maybe this extra overhead is enough to tip performance over the edge of the spiral? Does this happen with fewer actors (ropes) in your scene?
Don't know if I'm on the right track here. The spiral is always just a symptom of other issues, but these seem like the most plausible causes to me. Let me know how it goes!
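The "spike, then recovery" pattern described above can be reproduced with a small Python toy model. This is only a sketch under made-up assumptions (a 10 ms step cost, a 5 ms render cost, a single 300 ms one-off cost in the first frame, and a hypothetical function name); it says nothing about what actually causes the extra cost in Obi or Unity.

```python
def steps_after_spike(spike_ms, frames=10, step_cost_ms=10,
                      fixed_dt_ms=20, render_cost_ms=5, max_dt_ms=333):
    """Toy fixed-timestep loop with a one-off extra cost in the first frame."""
    counts = []
    accumulator_ms = 0
    frame_time_ms = fixed_dt_ms  # assume the first frame lasted one timestep
    for frame in range(frames):
        # Catch up with the previous frame's duration, up to the cap.
        accumulator_ms += min(frame_time_ms, max_dt_ms)
        steps, accumulator_ms = divmod(accumulator_ms, fixed_dt_ms)
        counts.append(steps)
        frame_time_ms = render_cost_ms + steps * step_cost_ms
        if frame == 0:
            frame_time_ms += spike_ms  # hypothetical setup work, paid once
    return counts


print(steps_after_spike(0))    # [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
print(steps_after_spike(300))  # [1, 15, 8, 4, 3, 1, 1, 1, 1, 0]
```

Because each step here costs less than the time it simulates, the loop is stable on its own; the single slow frame still forces several frames of catch-up steps before things settle down again.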
(20-01-2022, 11:00 AM)josemendez Wrote: - Maybe you're using async compilation in Burst? This will lazily compile your jobs on the fly the first time they're executed, during gameplay. The result is that the first few frames of anything that uses Burst are a lot slower since A) it's not actually using Burst yet B) code is being compiled in the background as the game runs. This only happens in-editor (standalone is precompiled at build time) and can be optionally disabled so all Burst code is compiled when you press play.
Thanks for the tip about async compilation. I didn't know about it, although I use jobs in other places. It improves performance a bit, but it doesn't seem to be the heart of the problem.
(20-01-2022, 11:00 AM)josemendez Wrote: - Instantiating objects in the solver does have some overhead, as particle data must be converted from actor to solver space and constraint batches from different actors merged. Maybe this extra overhead is enough to tip performance over the edge of the spiral? Does this happen with fewer actors (ropes) in your scene?
With only 3 actors (instead of 12) the problem looks similar: https://imgur.com/a/r2mQwZy. It looks like the death spiral is triggered by ObiSolver's AddActor function (see https://imgur.com/a/eYkvepv).
Is there a way to improve it, for example by changing some parameters?
21-01-2022, 03:57 PM
(This post was last modified: 21-01-2022, 04:03 PM by josemendez.)
For only 12 actors, AddActor() should be quite fast. I see no reason for it to trigger this.
I'm out of ideas at this point. Would it be possible for you to share with me a scene/project that reproduces this? You can send it to support(at)virtualmethodstudio.com, I will take a look and hopefully find the culprit.
(*)Edit: How many colliders do you have in your scene? In particular, how many MeshColliders? When a backend is first initialized, collider data is generated. If you have many different mesh colliders, the cost of building BVHs for all of them can add up. That would explain why the number of actors in the scene does not seem to affect performance, but this is just a guess.
kind regards,
(21-01-2022, 03:57 PM)josemendez Wrote: For only 12 actors, AddActor() should be quite fast. I see no reason for it to trigger this.
I'm out of ideas at this point. Would it be possible for you to share with me a scene/project that reproduces this? You can send it to support(at)virtualmethodstudio.com, I will take a look and hopefully find the culprit.
(*)Edit: How many colliders do you have in your scene? In particular, how many MeshColliders? When a backend is first initialized, collider data is generated. If you have many different mesh colliders, the cost of building BVHs for all of them can add up. That would explain why the number of actors in the scene does not seem to affect performance, but this is just a guess.
kind regards,
Unfortunately I can't give you a scene, because it's a huge project that I can't share at all... but thank you for offering!
As for colliders, there are about a dozen mesh colliders and box colliders. Could that be the cause of all this?
Excuse my ignorance, but what are BVHs?
(21-01-2022, 04:39 PM)lufydad Wrote: As for colliders, there are about a dozen mesh colliders and box colliders. Could that be the cause of all this?
Depends on the mesh colliders: how many triangles does each one have?
(21-01-2022, 04:39 PM)lufydad Wrote: Excuse my ignorance, but what are BVHs?
Bounding Volume Hierarchies. It's the acceleration data structure used to perform collision queries against triangle meshes.
Since testing each particle against each individual triangle in each collider is too much work, a BVH is generated for each mesh. This makes it possible to quickly prune triangles that are not close to a given particle, so each particle is tested against just a few (2-3) triangles. Basically, it makes collision detection fast enough to run in realtime.
The thing is that generating the BVH takes some time. In Obi, distance fields are a better alternative to MeshColliders, since A) they're built at edit time and stored on disk, so there is no runtime build overhead, and B) they're faster and more robust than MeshColliders.
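For readers who have not met BVHs before, here is a minimal Python sketch of the idea: build a tree of bounding boxes over a mesh's triangles, then collect only the triangles whose boxes overlap a particle. It is a generic textbook-style illustration (median split on the longest axis), not Obi's implementation; all names and the test mesh are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


def triangle_box(tri):
    """Axis-aligned bounding box (min corner, max corner) of one triangle."""
    lo = tuple(min(v[a] for v in tri) for a in range(3))
    hi = tuple(max(v[a] for v in tri) for a in range(3))
    return lo, hi


def merge(a, b):
    """Smallest box containing both boxes."""
    return (tuple(min(a[0][i], b[0][i]) for i in range(3)),
            tuple(max(a[1][i], b[1][i]) for i in range(3)))


@dataclass
class Node:
    box: tuple
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    tri_index: Optional[int] = None  # set only on leaves


def build(boxes, indices):
    """Recursively split the triangles in half along the longest box axis."""
    box = boxes[indices[0]]
    for i in indices[1:]:
        box = merge(box, boxes[i])
    if len(indices) == 1:
        return Node(box, tri_index=indices[0])
    axis = max(range(3), key=lambda a: box[1][a] - box[0][a])
    ordered = sorted(indices, key=lambda i: boxes[i][0][axis] + boxes[i][1][axis])
    mid = len(ordered) // 2
    return Node(box, build(boxes, ordered[:mid]), build(boxes, ordered[mid:]))


def sphere_hits_box(center, radius, box):
    """True if the sphere overlaps the box (distance to nearest box point)."""
    d2 = 0.0
    for a in range(3):
        nearest = min(max(center[a], box[0][a]), box[1][a])
        d2 += (center[a] - nearest) ** 2
    return d2 <= radius * radius


def query(node, center, radius, out):
    """Append indices of triangles whose boxes overlap the particle."""
    if not sphere_hits_box(center, radius, node.box):
        return  # the whole subtree is pruned with a single test
    if node.tri_index is not None:
        out.append(node.tri_index)
        return
    query(node.left, center, radius, out)
    query(node.right, center, radius, out)


# A strip of 64 triangles along the x axis, and one particle of radius 0.2.
triangles = [((i, 0, 0), (i + 1, 0, 0), (i, 1, 0)) for i in range(64)]
boxes = [triangle_box(t) for t in triangles]
root = build(boxes, list(range(len(boxes))))
candidates = []
query(root, (10.5, 0.2, 0.1), 0.2, candidates)
print(candidates)  # [10]
```

Out of 64 triangles, only one survives the pruning and would need an exact particle-triangle test. The build step visits and sorts every triangle at every tree level, which is where the up-front cost mentioned above comes from.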