Topic: Backtrace on segfault in NDK code

fzurita (Moderator)
« on: July 14, 2015, 05:35:48 PM »
What is the easiest way to get a backtrace on crashes in NDK code?

When something crashes in NDK code, all I see in logcat is a memory address. I'm getting a few crashes in the Mupen64Plus core on the Nexus Player. I'm not seeing these crashes on my NVIDIA Shield Portable, so I suspect they're related to the x86 processor.

I would like to take a shot at fixing them.

Edit: Spelling and grammar
« Last Edit: July 15, 2015, 06:19:37 AM by fzurita »

Gillou68310 (Developer)
« Reply #1 on: July 15, 2015, 02:20:12 AM »
Using ndk-stack is the only way to get something readable from a segfault, but depending on what caused the crash, the output may be completely unusable (e.g. dynarec crashes).
Could you give more information on your crashes? Which game, which plugin, where it crashed, is it a random crash...?
Maybe I can help you to find the source ;)

fzurita (Moderator)
« Reply #2 on: July 15, 2015, 06:16:57 AM »
I have a save state in Super Mario 64 where the crash happens, but I can't access it until I get back home. The crash happens as you walk towards the first Bowser level, when you get close to the big star door. It happens with GLideN64 and it's 100% reproducible so far.

I'll post the save state once I get home tonight. I'll also try the ndk-stack command. It would be nice if you could run it directly on the logcat output in Eclipse.
« Last Edit: July 15, 2015, 06:29:39 AM by fzurita »

xperia64 (Moderator)
« Reply #4 on: July 16, 2015, 12:15:43 AM »
No crash for me, and I tried every GLide variant. The only issue I had was that the Peach-Bowser painting transition at the end of the hallway doesn't work with the GLES2 version.

Gillou68310 (Developer)
« Reply #5 on: July 16, 2015, 01:38:35 AM »
@xperia64 which device are you testing on?

fzurita (Moderator)
« Reply #6 on: July 16, 2015, 10:44:30 AM »
I'll try to get a backtrace soon using ndk-stack. Hopefully it will give me a file and line number. Is the default build environment set to build in debug mode? In my experience, if it's not, it will be harder to get a file and line number.

retroben
« Reply #7 on: July 16, 2015, 11:10:54 PM »
While you're at it, how about looking at Castlevania 64, since it segfaults at the first skeleton cutscene?  ;)

fzurita (Moderator)
« Reply #8 on: July 18, 2015, 11:46:44 PM »
Ok. So this doesn't look like a standard segfault... See this:

Code:
07-19 00:44:03.506: A/libc(30707): Fatal signal 11 (SIGSEGV), code 1, fault addr 0x0 in tid 31747 (CoreThread)

It seems like we get a thread id, but no backtrace at all. And the fault addr of 0x0 is very funky.

Edit: it's definitely a problem with GLideN64; the crash doesn't happen with Glide64.
« Last Edit: July 19, 2015, 12:11:06 AM by fzurita »

fzurita (Moderator)
« Reply #9 on: July 19, 2015, 11:31:59 PM »
I was able to get a backtrace by following some tutorials online. I was not able to get one using ndk-stack, because logcat only shows a single-line crash message:
07-20 00:18:40.141: I/app_name(4593):   # 0: 0xe38475c4 
07-20 00:18:40.141: I/app_name(4593):   # 1: 0xe3847ad5 
07-20 00:18:40.141: I/app_name(4593):   # 2: 0xf77a65c1  InvokeUserSignalHandler
07-20 00:18:40.141: I/app_name(4593):   # 3: 0xf440bad9 
07-20 00:18:40.141: I/app_name(4593):   # 4: 0xf74df380 
07-20 00:18:40.141: I/app_name(4593):   # 5: 0xec103402 
07-20 00:18:40.141: I/app_name(4593):   # 6: 0xec0e2cfc 
07-20 00:18:40.141: I/app_name(4593):   # 7: 0xec0f4e68 
07-20 00:18:40.141: I/app_name(4593):   # 8: 0xec12fc8a 
07-20 00:18:40.141: I/app_name(4593):   # 9: 0xec167ebc  GLSLCompileToUniflex
07-20 00:18:40.141: I/app_name(4593):   #10: 0xec1f27c0 
07-20 00:18:40.141: I/app_name(4593):   #11: 0xec1f2ff1  glCompileShader
07-20 00:18:40.141: I/app_name(4593):   #12: 0xe3886573 
07-20 00:18:40.141: I/app_name(4593):   #13: 0xe3844df1 
07-20 00:18:40.141: I/app_name(4593):   #14: 0xe384656e 
07-20 00:18:40.141: I/app_name(4593):   #15: 0xe386b569 
07-20 00:18:40.141: I/app_name(4593):   #16: 0xe386bf00 
07-20 00:18:40.141: I/app_name(4593):   #17: 0xe386c80f 
07-20 00:18:40.141: I/app_name(4593):   #18: 0xe386193b 
07-20 00:18:40.141: I/app_name(4593):   #19: 0xe384a541 
07-20 00:18:40.141: I/app_name(4593):   #20: 0xe387344e 
07-20 00:18:40.141: I/app_name(4593):   #21: 0xe3880805 
07-20 00:18:40.141: I/app_name(4593):   #22: 0xe384703d  ProcessDList
07-20 00:18:40.141: I/app_name(4593):   #23: 0xe367678c 
07-20 00:18:40.141: I/app_name(4593):   #24: 0xe366918b 
07-20 00:18:40.141: I/app_name(4593):   #25: 0xe367693e  DoRspCycles
07-20 00:18:40.141: I/app_name(4593):   #26: 0xdf1be591 
07-20 00:18:40.141: I/app_name(4593):   #27: 0xdf15165a 
07-20 00:18:40.141: I/app_name(4593):   #28: 0xdbcaf367 

Edit 1: This is pretty useless, it's missing too many symbols. I got this by catching the segmentation fault signal.
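
For anyone trying the same thing, catching the signal and walking the stack can be sketched roughly as below. This is a generic version of the technique, not the exact code from the tutorials; captureBacktrace, dumpBacktrace and installSegvHandler are names invented here for illustration. It uses _Unwind_Backtrace from <unwind.h> to collect return addresses and dladdr to look up exported symbol names, which is probably why only a handful of frames get a name. It prints to stderr so it can be tried on any Linux box; on Android you would log with __android_log_print instead.

```cpp
#include <unwind.h>
#include <dlfcn.h>
#include <signal.h>
#include <cstddef>
#include <cstdint>
#include <cstdio>

namespace {

// Cursor into the caller-provided buffer of frame addresses.
struct BacktraceState {
    void** current;
    void** end;
};

// Called by the unwinder once per stack frame; stores that frame's PC.
_Unwind_Reason_Code unwindCallback(struct _Unwind_Context* context, void* arg) {
    BacktraceState* state = static_cast<BacktraceState*>(arg);
    uintptr_t pc = _Unwind_GetIP(context);
    if (pc != 0) {
        if (state->current == state->end) {
            return _URC_END_OF_STACK;  // buffer full, stop walking
        }
        *state->current++ = reinterpret_cast<void*>(pc);
    }
    return _URC_NO_REASON;
}

}  // namespace

// Fills buffer with up to max return addresses; returns how many were stored.
size_t captureBacktrace(void** buffer, size_t max) {
    BacktraceState state = {buffer, buffer + max};
    _Unwind_Backtrace(unwindCallback, &state);
    return static_cast<size_t>(state.current - buffer);
}

// Prints one line per frame, with a symbol name when dladdr can find one.
void dumpBacktrace(FILE* out, void** buffer, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        const char* symbol = "";
        Dl_info info;
        if (dladdr(buffer[i], &info) && info.dli_sname) {
            symbol = info.dli_sname;
        }
        fprintf(out, "  #%2zu: %p  %s\n", i, buffer[i], symbol);
    }
}

// SIGSEGV handler. fprintf and dladdr are not async-signal-safe, so this is
// a debugging aid only, not something to ship.
void segvHandler(int sig, siginfo_t* info, void*) {
    fprintf(stderr, "signal %d, code %d, fault addr %p\n",
            sig, info->si_code, info->si_addr);
    void* frames[32];
    size_t count = captureBacktrace(frames, 32);
    dumpBacktrace(stderr, frames, count);
    // Restore the default action and re-raise so the process still dies
    // with SIGSEGV and the system can record the crash as usual.
    signal(sig, SIG_DFL);
    raise(sig);
}

// Installs the handler; returns true on success.
bool installSegvHandler() {
    struct sigaction sa = {};
    sa.sa_sigaction = segvHandler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}
```

Calling installSegvHandler() once at startup is enough. The raw addresses still need the library load address subtracted before they can be matched against symbols in the unstripped .so files.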

Edit 2: I think I installed an app that is preventing android from successfully creating core dumps. Unfortunately, to fix this, I have to reinstall my nexus player system image...

Edit 3: This may not be that useless. It seems to be crashing while compiling a specific shader. I can add some logging every time we compile a shader and with that I can figure out which shader is the one causing the issue.

« Last Edit: July 20, 2015, 11:47:28 PM by fzurita »

fzurita (Moderator)
« Reply #10 on: July 27, 2015, 11:03:50 PM »
Edit: Improved readability

Ok.... I finally had success, here is the real stack trace:

Quote
********** Crash dump: **********
Build fingerprint: 'Android/full_fugu/fugu:5.1.1/LMY47V/cloudproject05011142:user/release-keys'
pid: 3393, tid: 3648, name: CoreThread  >>> org.mupen64plusae.v3.alpha <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0
#00 pc 00039402  libglslcompiler.so
#01 pc 00018cfb  libglslcompiler.so
#02 pc 0002ae67  libglslcompiler.so
#03 pc 00065c89  libglslcompiler.so
#04 pc 0009debb  libglslcompiler.so (GLSLCompileToUniflex+507)
#05 pc 000547bf  libGLESv2_POWERVR_ROGUE.so
#06 pc 00054ff0  libGLESv2_POWERVR_ROGUE.so (glCompileShader+48)
#07 pc 00049b32  libmupen64plus-video-gliden64-gles31.so: Routine ShaderCombiner::ShaderCombiner(Combiner&, Combiner&, gDPCombine const&) at /mupen64plus-video-gliden64/src/OGL3X/GLSLCombiner_ogl3x.cpp:387
#08 pc 000096a0  libmupen64plus-video-gliden64-gles31.so: Routine CombinerInfo::_compile(unsigned long long) const at /mupen64plus-video-gliden64/src/Combiner.cpp:210
#09 pc 0000ae1d  libmupen64plus-video-gliden64-gles31.so: Routine CombinerInfo::setCombine(unsigned long long) at /mupen64plus-video-gliden64/src/Combiner.cpp:239
#10 pc 0002f3e8  libmupen64plus-video-gliden64-gles31.so: Routine OGLRender::_updateStates(OGLRender::RENDER_STATE) const at /mupen64plus-video-gliden64/src/OpenGL.cpp:550
#11 pc 0002fd7f  libmupen64plus-video-gliden64-gles31.so: Routine OGLRender::_prepareDrawTriangle(bool) at /mupen64plus-video-gliden64/src/OpenGL.cpp:680
#12 pc 0003068e  libmupen64plus-video-gliden64-gles31.so: Routine OGLRender::drawTriangles() at /mupen64plus-video-gliden64/src/OpenGL.cpp:768
#13 pc 0002578a  libmupen64plus-video-gliden64-gles31.so: Routine gSPFlushTriangles at /mupen64plus-video-gliden64/src/gSP.cpp:30
#14 pc 0000e360  libmupen64plus-video-gliden64-gles31.so: Routine F3D_Tri1(unsigned int, unsigned int) at /mupen64plus-video-gliden64/src/F3D.cpp:119
#15 pc 00036945  libmupen64plus-video-gliden64-gles31.so: Routine RSP_ProcessDList() at /mupen64plus-video-gliden64/src/RSP.cpp:203
#16 pc 00043ca4  libmupen64plus-video-gliden64-gles31.so: Routine PluginAPI::ProcessDList() at /mupen64plus-video-gliden64/src/common/CommonAPIImpl_common.cpp:86
#17 pc 0000b7ac  libmupen64plus-video-gliden64-gles31.so (ProcessDList+28): Routine ProcessDList at /mupen64plus-video-gliden64/src/CommonPluginAPI.cpp:23
#18 pc 0001578b  libmupen64plus-rsp-hle.so: Routine HleProcessDlistList at /mupen64plus-rsp-hle/src/plugin.c:101
#19 pc 0000818a  libmupen64plus-rsp-hle.so: Routine forward_gfx_task at /mupen64plus-rsp-hle/src/hle.c:172
#20 pc 0001593d  libmupen64plus-rsp-hle.so (DoRspCycles+29): Routine DoRspCycles at /mupen64plus-rsp-hle/src/plugin.c:182
#21 pc 00099590  libmupen64plus-core.so: Routine do_SP_Task at /mupen64plus-core/src/rsp/rsp_core.c:281
#22 pc 0002c659  libmupen64plus-core.so: Routine writew at /mupen64plus-core/src/memory/memory.c:152
#23 pc 0002f19e  <unknown>: Unable to open symbol file \mupen64plus-ae\obj\local\x86/<unknown>. Error (22): Invalid argument
« Last Edit: July 28, 2015, 06:41:42 AM by fzurita »

fzurita (Moderator)
« Reply #11 on: July 27, 2015, 11:10:16 PM »
Edit: Improved readability

And here is the shader program that is causing issues. It seems to be incomplete; there is no actual logic in it:

Code:
Error in fragment shader:
#version 310 es
#ifdef GL_NV_fragdepth
    #extension GL_NV_fragdepth : enable
#endif
#ifdef GL_OES_standard_derivatives
    #extension GL_OES_standard_derivatives : enable
#endif
uniform sampler2D uTex0;
uniform sampler2D uTex1;
uniform lowp sampler2DMS uMSTex0;
uniform lowp sampler2DMS uMSTex1;
uniform lowp ivec2 uMSTexEnabled;
layout (std140) uniform ColorsBlock {
  lowp vec4 uFogColor;
  lowp vec4 uCenterColor;
  lowp vec4 uScaleColor;
  lowp vec4 uBlendColor;
  lowp vec4 uEnvColor;
  lowp vec4 uPrimColor;
  lowp float uPrimLod;
  lowp float uK4;
  lowp float uK5;
};
uniform mediump vec2 uScreenScale;
uniform lowp int uAlphaCompareMode;
uniform lowp int uAlphaDitherMode;
uniform lowp int uColorDitherMode;
uniform lowp int uGammaCorrectionEnabled;
uniform lowp int uFogUsage;
uniform lowp ivec2 uFb8Bit;
uniform lowp ivec2 uFbFixedAlpha;
uniform lowp int uSpecialBlendMode;
uniform lowp int uEnableAlphaTest
07-28 00:08:52.453: A/libc(4204): Fatal signal 11 (SIGSEGV), code 1, fault addr 0x0 in tid 4295 (CoreThread)

Hmm.... just by looking at the code, there should be way more in there. Logcat must not be printing out the whole thing.
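
If logcat is cutting the message off, one workaround is to log the shader in several smaller pieces. Below is a minimal sketch, assuming a per-message limit somewhere around 1 KB (the exact limit is an assumption, and the 512-byte chunk size is just a conservative guess). splitForLog and logShaderSource are hypothetical helper names, not anything from GLideN64. It writes to stderr so it runs anywhere; on Android the fputs call would be replaced by a call to __android_log_write.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Splits text into pieces of at most maxLen bytes. Where possible a piece
// ends right after a newline so that shader lines are not cut in half.
std::vector<std::string> splitForLog(const std::string& text, std::size_t maxLen) {
    std::vector<std::string> chunks;
    if (maxLen == 0) {
        return chunks;
    }
    std::size_t pos = 0;
    while (pos < text.size()) {
        std::size_t len = std::min(maxLen, text.size() - pos);
        if (pos + len < text.size()) {
            // Not the final piece: back up to the last newline in the window.
            std::size_t nl = text.rfind('\n', pos + len - 1);
            if (nl != std::string::npos && nl >= pos) {
                len = nl - pos + 1;
            }
        }
        chunks.push_back(text.substr(pos, len));
        pos += len;
    }
    return chunks;
}

// Logs a shader source one chunk at a time.
void logShaderSource(const std::string& source) {
    for (const std::string& chunk : splitForLog(source, 512)) {
        // On Android: __android_log_write(ANDROID_LOG_ERROR, tag, chunk.c_str());
        std::fputs(chunk.c_str(), stderr);
    }
}
```

Logging the source before calling glCompileShader, rather than after, means the full text is already in the log when the driver crashes.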
« Last Edit: July 28, 2015, 07:10:06 AM by fzurita »

fzurita (Moderator)
« Reply #12 on: August 01, 2015, 10:31:20 PM »
So, I was able to trace the core dump down to a very specific GLSL call. Any use of the dFdx function within GLSL causes the glCompileShader call to crash. I do notice that you are supposed to enable the extension GL_OES_standard_derivatives within the GLSL shader to be able to use dFdx, but Gonetz seems to be doing that correctly. Anyone have any ideas?

fzurita (Moderator)
« Reply #13 on: August 01, 2015, 10:41:59 PM »
Hmm... I can't believe that fixed it. So getting rid of this fixes the issue:

"#ifdef GL_OES_standard_derivatives         \n"
"    #extension GL_OES_standard_derivatives : enable \n"
"#endif                              \n"

Even after taking out those lines, the LOD shader keeps working correctly.

xperia64 (Moderator)
« Reply #14 on: August 01, 2015, 10:46:43 PM »
I don't know much about OpenGL programming, but it seems a bit odd to check for extensions in a compiler macro. I would think they should be checked dynamically on the device that's running the GL code.
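
For what it's worth, the #ifdef in that string is evaluated by the driver's GLSL compiler on the device when glCompileShader runs, not by the C++ compiler, so it is already a device-side check. Doing the check on the C++ side instead could look roughly like the sketch below. It assumes the extension list has already been obtained (for example from glGetString(GL_EXTENSIONS)), and the helper names hasExtension and derivativesDirective are made up for illustration; this is not GLideN64's actual code. It also relies on dFdx, dFdy and fwidth being part of core GLSL ES from version 3.00 on, which would explain why the "#version 310 es" shader still works without the directive.

```cpp
#include <sstream>
#include <string>

// Returns true if name appears as a whole token in the space-separated
// extension list, so "GL_OES_foo" does not match "GL_OES_foo_bar".
bool hasExtension(const std::string& extensionList, const std::string& name) {
    std::istringstream in(extensionList);
    std::string token;
    while (in >> token) {
        if (token == name) {
            return true;
        }
    }
    return false;
}

// Returns the #extension line to prepend to a fragment shader, or an empty
// string when none is needed. glslEsVersion is the number from the shader's
// #version line (100, 300, 310, ...).
std::string derivativesDirective(int glslEsVersion, const std::string& extensionList) {
    if (glslEsVersion >= 300) {
        // Derivative functions are core from GLSL ES 3.00, no directive needed.
        return "";
    }
    if (hasExtension(extensionList, "GL_OES_standard_derivatives")) {
        return "#extension GL_OES_standard_derivatives : enable\n";
    }
    // Extension not advertised: the shader must avoid dFdx/dFdy entirely.
    return "";
}
```

With this approach the directive never reaches the compiler on an ES 3.1 context, which sidesteps the crash without depending on how a particular driver handles the #ifdef.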