
I don't yet fully understand how segfaults and backtraces work, but I get the impression that if the function at the top of the list references "glib" or "gobject", you have Bad Issues(TM) with libraries that usually shouldn't go wrong.

Well, that's what I'm getting here, from two completely different programs.

The first is the latest build of irssi, compiled (cleanly, without any glitches or errors) directly from github.com.

Program received signal SIGSEGV, Segmentation fault.
0xb7cf77ea in g_ascii_strcasecmp () from /usr/lib/libglib-2.0.so.0
(gdb) bt
#0  0xb7cf77ea in g_ascii_strcasecmp () from /usr/lib/libglib-2.0.so.0
#1  0x08103455 in config_node_section_index ()
#2  0x081036b0 in config_node_traverse ()
#3  0x080fb674 in settings_get_bool ()
#4  0x08090bce in command_history_init ()
#5  0x08093d81 in fe_common_core_init ()
#6  0x0805a60d in main ()

The second program I'm having issues with is the NetSurf web browser (which also compiles 100% cleanly) when built against GTK (when not built to use GTK it runs fine):

Program received signal SIGSEGV, Segmentation fault.
0xb7c1bace in g_type_check_instance_cast () from /usr/lib/libgobject-2.0.so.0
(gdb) bt
#0  0xb7c1bace in g_type_check_instance_cast () from /usr/lib/libgobject-2.0.so.0
#1  0x080cd31c in nsgtk_scaffolding_set_websearch ()
#2  0x080d05da in nsgtk_new_scaffolding ()
#3  0x080dafd8 in gui_create_browser_window ()
#4  0x0809e806 in browser_window_create ()
#5  0x080c2fa9 in ?? ()
#6  0x0807c09d in main ()

I'm 99.99% confident the issues I'm looking at are some kind of glitch-out with glib2. The rest of my system works 100% fine, just these two programs are doing weird things.
I'm similarly confident that if I tried to build other programs that used these libraries, they would quite likely fail too.

Obviously, poking glib and friends - and making even one tiny little mistake - is an instant recipe to make practically every single program in the system catastrophically break horribly (and I speak from experience with another system, long ago :P).
Given I have absolutely no idea what I'm doing with this kind of thing and I know it, I am loath to go there; I'd like to keep my current system configuration functional :)

I was thinking of compiling a new version of glib2 (and co.), then statically linking these programs against it. I just have no idea how to do this - what steps do I need to perform?

An alternative idea I had was to ./configure --prefix=/usr; make; make install exactly the same version of glib I have right now "back into" my system, to reinstall it. I see that the associated core libraries all end with "0.3200.4":

-rwxr-xr-x 1 root root 1.4M Aug  9  2012 /usr/lib/libgio-2.0.so.0.3200.4
-rwxr-xr-x 1 root root 1.2M Aug  9  2012 /usr/lib/libglib-2.0.so.0.3200.4
-rwxr-xr-x 1 root root  11K Aug  9  2012 /usr/lib/libgmodule-2.0.so.0.3200.4
-rwxr-xr-x 1 root root 308K Aug  9  2012 /usr/lib/libgobject-2.0.so.0.3200.4
-rwxr-xr-x 1 root root 3.7K Aug  9  2012 /usr/lib/libgthread-2.0.so.0.3200.4

Would that possibly work, or break things horribly? :S
If it would possibly work, what version does "0.3200.4" translate to?

What other ideas can I try?

I'm not necessarily looking for fixes for glib itself that correct whatever fundamental error is going on - it isn't affecting me that badly. I just want to get irssi and NetSurf to run correctly.

i336_

1 Answer


I get the impression that if the function at the top of the list references "glib" or "gobject", you have Bad Issues(TM) with libraries that usually shouldn't go wrong.

You get the wrong impression, if you mean this indicates the flaw is probably in those libraries. It doesn't mean that; it more likely means that's where an earlier mistake finally blew up. By nature C doesn't have a lot of runtime safeguards in it, so you can easily pass arguments that will compile but aren't validated any further (unless you do it yourself). Simple example:

#include <stdio.h>
#include <string.h>
int main (void) {
    char whoops[3] = { 'a', 'b', 'c' };  /* no room for a terminating '\0' */
    if (strcmp(whoops, "abcdef")) puts(whoops);
}

This passes an unterminated string to two different string functions. It will compile without complaint, and will most likely run okay because the out-of-bounds read will be very slight, but it could seg fault in strcmp() or puts(). That doesn't mean the strcmp() implementation is buggy; the mistake is clearly right there in main().

Functions like those can't logically determine if an argument passed is properly terminated (this is what I meant WRT runtime checks and C "by nature" lacking them). There's not much point in stipulating the compiler should check, because most of the time the data won't be hard coded like that.
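
To show what "doing it yourself" could look like, here is a minimal sketch of a caller-side check. The helper names is_terminated() and checked_equals() are hypothetical, made up for illustration. Note that the check only works because the caller knows the buffer size, which is exactly the information strcmp() never receives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if buf[0..size) contains a '\0' terminator. */
static bool is_terminated(const char *buf, size_t size) {
    assert(buf != NULL);  /* precondition the caller must satisfy */
    return memchr(buf, '\0', size) != NULL;
}

/* Hypothetical wrapper: only calls strcmp() once the buffer is known to be
   a valid C string. Returns 0 on match, 1 on mismatch, and -1 if buf has
   no terminator within its first `size` bytes. */
static int checked_equals(const char *buf, size_t size, const char *expected) {
    if (!is_terminated(buf, size))
        return -1;
    return strcmp(buf, expected) == 0 ? 0 : 1;
}
```

With the whoops array from the example above, checked_equals(whoops, sizeof whoops, "abcdef") would return -1 instead of reading past the end of the array.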

The stuff in the middle of a backtrace doesn't necessarily play a role either, although it could. Generally the place to start looking is the last entry; that's where the problem has been traced back to.
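
As an illustration of how a caller's mistake can leave a library function at the top of the backtrace, consider the sketch below. Everything in it (struct setting, lookup(), setting_is_on()) is hypothetical and is not a diagnosis of what irssi or NetSurf actually do; it only shows the general pattern of an unchecked NULL reaching a string comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key/value table standing in for a parsed config file. */
struct setting { const char *key; const char *value; };

/* Hypothetical lookup: returns the value for key, or NULL if absent. */
static const char *lookup(const struct setting *table, size_t n, const char *key) {
    assert(table != NULL && key != NULL);
    for (size_t i = 0; i < n; i++)
        if (strcmp(table[i].key, key) == 0)
            return table[i].value;
    return NULL;
}

/* Buggy caller pattern (deliberately not executed here):
 *     strcmp(lookup(table, n, "missing"), "on")
 * Passing NULL to strcmp() is undefined behaviour and typically crashes
 * inside the C library, so frame #0 names libc even though the mistake
 * is the caller's missing NULL check. */

/* Guarded caller: checks the lookup result before comparing. */
static int setting_is_on(const struct setting *table, size_t n, const char *key) {
    const char *value = lookup(table, n, key);
    return value != NULL && strcmp(value, "on") == 0;
}
```

In the buggy pattern, a debugger would report the crash in the comparison function, with the application code only appearing one or more frames further down.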

But the bug could always be anywhere. Often comparing a backtrace to errors reported by a mem checker like valgrind can help narrow things down. WRT your examples there may be a lot to sift through though; last I checked valgrind and gtk were not happy playmates.

I was thinking of compiling a new version of glib2 (and co.), then statically linking these programs against it.

You could, although I don't see any reason to believe anything will work any better because of it. It's grasping at straws. You can't actually debug the problem yourself, which is understandable, so you consider what you could try out of desperation.

Most likely you will just be wasting a lot of time and frustrating yourself.

I'm 99.99% confident the issues I'm looking at are some kind of glitch-out with glib2.

I'm 99% confident you are overconfident there.

While again the bug could be anywhere, as a rule of thumb, consider the most widely tested parts the least likely culprits. In this case, glib is pretty ubiquitous, whereas irssi and NetSurf are relatively obscure.

The best thing for you to do is probably file a bug report. Backtraces are usually much appreciated there. Start with irssi and NetSurf; if you go straight to glib they will, reasonably enough, just say there's no reason for them to believe it's their problem unless you can demonstrate it (which none of this does). If on the other hand the irssi people determine it is in glib, they'll probably want to pursue that themselves.

goldilocks
  • Thanks for that excellent explanation :) now I know the problem is likely to lie with the application, I'll file some bug reports! (I'd originally assumed my copy of glib was faulty because two separate applications were causing errors in glib-related libraries.) – i336_ Nov 14 '14 at 02:24