The International Simutrans Forum

 

Author Topic: Experimental on Linux  (Read 11427 times)

Offline sdog

  • Devotee
  • *
  • Posts: 2039
Re: Experimental on Linux
« Reply #35 on: May 11, 2010, 07:12:32 AM »
Quote
Is this bug still happening with -DUSE_C?  If so please rerun it with the latest devel and give me a new line number, since the old one seems a bit out of date.
it still happens with -DUSE_C.
Code:
dataobj/einstellungen.cc:982: error: cast from ‘const char*’ to ‘uint32’ loses precision

Quote
Oh hell, I found a logic bug in vector_tpl in experimental -- nasty off-by-one error.  Don't roll your own container classes, folks!

Patch is on the jp-devel branch of my git repo (git://github.com/neroden/simutrans)

If that doesn't fix it, or if the bug is in standard (which doesn't contain the off-by-one error) then I need two pieces of information from gdb: the output of 'print count' and 'bt'.
I pulled vector_tpl from your branch, changed einstellungen.cc:982 to uint64 and compiled it.
i'll let it run on fast time, and go away for a while.

It didn't crash. You must have fixed it.

The game froze however, after playing around in the depot a bit. No input possible, date stuck at 1 February, news ticker in bottom panel continued to run.

killed it with ^c
Code:
^C
Program received signal SIGINT, Interrupt.
0x000000000066b8e1 in colorpixcopy (dest=0x891c75e, src=0x11a8754, end=0x11a877e) at simgraph16.cc:1932
1932 while (src < end) {
(gdb) bt
#0  0x000000000066b8e1 in colorpixcopy (dest=0x891c75e, src=0x11a8754, end=0x11a877e) at simgraph16.cc:1932
#1  0x000000000066c3f7 in display_color_img_aux (sp=0x11a8730, x=389, y=471, h=19) at simgraph16.cc:2526
#2  0x000000000066cb58 in display_base_img (n=1638, xp=341, yp=410, player_nr=0 '\000', daynight=0, dirty=1)
    at simgraph16.cc:2642
#3  0x00000000004bfa79 in gui_image_list_t::zeichnen (this=0x7899bf8, parent_pos=...)
    at gui/components/gui_image_list.cc:99
#4  0x00000000004fe3a9 in gui_container_t::zeichnen (this=0x789a918, offset=...) at gui/gui_container.cc:123
#5  0x00000000004c4e2f in gui_scrollpane_t::zeichnen (this=0x7899d18, pos=...)
    at gui/components/gui_scrollpane.cc:148
#6  0x00000000004c63df in gui_tab_panel_t::zeichnen (this=0x78993a8, parent_pos=...)
    at gui/components/gui_tab_panel.cc:121
#7  0x00000000004fe3a9 in gui_container_t::zeichnen (this=0x78991f0, offset=...) at gui/gui_container.cc:123
#8  0x00000000004b6227 in gui_convoy_assembler_t::zeichnen (this=0x78991f0, parent_pos=...)
    at gui/components/gui_convoy_assembler.cc:518
#9  0x00000000004fe3a9 in gui_container_t::zeichnen (this=0x7898948, offset=...) at gui/gui_container.cc:123
#10 0x00000000004ffed5 in gui_frame_t::zeichnen (this=0x7898940, pos=..., gr=...) at gui/gui_frame.cc:166
#11 0x00000000004e5d7a in depot_frame_t::zeichnen (this=0x7898940, pos=..., groesse=...) at gui/depot_frame.cc:613
#12 0x0000000000616011 in display_win (win=0) at simwin.cc:647
#13 0x00000000006160b3 in display_all_win () at simwin.cc:670
#14 0x0000000000617843 in win_display_flush (konto=223383.38) at simwin.cc:1023
#15 0x00000000005dd956 in intr_refresh_display (dirty=false) at simintr.cc:76
#16 0x00000000006278b0 in karte_t::sync_step (this=0xeb1960, delta_t=133, sync=false, display=true)
    at simworld.cc:2792
#17 0x00000000005dda49 in interrupt_check (caller_info=0x6a0c81 "simfab 636") at simintr.cc:101
#18 0x00000000005c3fa3 in fabrik_t::step (this=0x7355440, delta_t=100) at simfab.cc:1068
#19 0x0000000000629d14 in karte_t::step (this=0xeb1960) at simworld.cc:3419
#20 0x0000000000633e46 in karte_t::interactive (this=0xeb1960, quit_month=2147483647) at simworld.cc:5725
---Type <return> to continue, or q <return> to quit---
#21 0x00000000005e6633 in simu_main (argc=1, argv=0x7fffffffe2e8) at simmain.cc:1075
#22 0x00000000006713e3 in main (argc=1, argv=0x7fffffffe2e8) at simsys_s.cc:748


ccache is used for the auto builds, according to the config.default Ansgar posted. Perhaps it's worth a try for him to compile without ccache?
« Last Edit: May 11, 2010, 07:33:20 AM by sdog »

Offline ansgar

  • *
  • Posts: 80
Re: Experimental on Linux
« Reply #36 on: May 11, 2010, 01:15:44 PM »
Quote
Ansgar: Is ccache active on the autobuild system?

Yes, but I configured ccache to use a different cache directory for amd64 and i386 (IIRC it didn't work at all before).  I just cleaned the cache anyway.

Offline neroden

  • Devotees (Inactive)
  • *
  • Posts: 831
  • Nathanael Nerode
Re: Experimental on Linux
« Reply #37 on: May 11, 2010, 02:44:03 PM »
Quote
it still happens with -DUSE_C.
Code:
dataobj/einstellungen.cc:982: error: cast from ‘const char*’ to ‘uint32’ loses precision
This was a silly coding error in experimental.  I just fixed it on my jp-devel branch.  I'm surprised it worked on 32-bit machines, as it shouldn't have!
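For anyone who runs into the same compiler message: on amd64 a pointer is 64 bits wide, so a cast to a 32-bit integer would throw away the upper half, and GCC refuses it. What follows is only a sketch of that general point, not the code from einstellungen.cc and not necessarily the fix on jp-devel (the right fix may well be not to cast the pointer at all). The helper names are invented for illustration; `std::uintptr_t` is the standard integer type meant for holding pointer values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration; not taken from einstellungen.cc.

// std::uintptr_t is wide enough to hold a pointer value, so this
// cast compiles on both i386 and amd64.
std::uintptr_t pointer_as_integer(const char* p)
{
    return reinterpret_cast<std::uintptr_t>(p);
}

// Converting back gives the original pointer again.
const char* integer_as_pointer(std::uintptr_t v)
{
    return reinterpret_cast<const char*>(v);
}

// Round trip check; a 32-bit integer could not guarantee this on amd64.
void check_round_trip(const char* p)
{
    assert(integer_as_pointer(pointer_as_integer(p)) == p);
}
```

By contrast, a C-style cast of a `const char*` to a 32-bit unsigned type is the kind of line that produces the "loses precision" error above on a 64-bit build.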

The lockup is going to be harder to debug; the place you stopped it is working just fine (as proved by the fact that the ticker keeps running, in a single-threaded program).
« Last Edit: May 11, 2010, 02:50:54 PM by neroden »

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18753
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Experimental on Linux
« Reply #38 on: May 11, 2010, 09:12:03 PM »
Knightly,

I'm probably the silly coder in this case. I'd be very grateful if you could release your updated version; although performance is slower, I find it acceptable enough, and I have made it optional in Experimental in any case for those who don't like the extra time that it takes to generate a map.

Offline steffen

  • *
  • Posts: 78
Re: Experimental on Linux
« Reply #39 on: May 11, 2010, 10:52:45 PM »
OK, it seems I found what causes the segfault on startup with the autobuild. It's either OPTIMISE = 1 or -DNEW_PATHING.
Here's the backtrace for a start. I'll now test which of these is the actual culprit and then try it with devel. I realise now that I should've tried devel first; I didn't take into account that these flags can influence which code is compiled *sigh*

Program received signal SIGSEGV, Segmentation fault.
0x000000000062997b in display_fb_internal (xp=<value optimized out>, yp=<value optimized out>, w=<value optimized out>, h=88,
    color=<value optimized out>, dirty=<value optimized out>, cL=<value optimized out>, cR=<value optimized out>, cT=0,
    cB=<value optimized out>) at simgraph16.cc:2959
2959                                    *lp++ = longcolval;
(gdb) bt
#0  0x000000000062997b in display_fb_internal (xp=<value optimized out>, yp=<value optimized out>, w=<value optimized out>, h=88,
    color=<value optimized out>, dirty=<value optimized out>, cL=<value optimized out>, cR=<value optimized out>, cT=0,
    cB=<value optimized out>) at simgraph16.cc:2959
#1  0x0000000000629aee in display_fillbox_wh (xp=44, yp=176, w=20338, h=1, color=<value optimized out>, dirty=-135966870)
    at simgraph16.cc:2990
#2  0x00000000004d6c1b in gui_frame_t::zeichnen (this=<value optimized out>, pos=..., gr=<value optimized out>) at gui/gui_frame.cc:155
#3  0x00000000004fe59c in pakselector_t::zeichnen (this=0x2c, p=..., gr=<value optimized out>) at gui/pakselector.cc:64
#4  0x00000000005aa74e in ask_objfilename (argc=1, argv=<value optimized out>) at simmain.cc:247
#5  simu_main (argc=1, argv=<value optimized out>) at simmain.cc:644
#6  0x000000000062dc67 in main (argc=1, argv=0x7fffffffdac8) at simsys_s.cc:743
(gdb) quit

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18753
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Experimental on Linux
« Reply #40 on: May 11, 2010, 10:57:59 PM »
Steffen,

The "NEW_PATHING" preprocessor define has long been deprecated, so that can be eliminated as a possible cause. References to it should be removed. I should also note that, so far as I am aware, none of those parts of the code have had any modifications for Experimental.

Offline steffen

  • *
  • Posts: 78
Re: Experimental on Linux
« Reply #41 on: May 11, 2010, 11:06:32 PM »
Yes, you're right: I found that activating NEW_PATHING lets the program run fine, whilst OPTIMISE breaks it. I've now looked in the Makefile, and the only conclusion I can draw is that it's a toolchain bug. If anyone agrees I'll head over to GCC and post a bug report.

ansgar: in the meantime, can you deactivate OPTIMISE for the amd64 autobuild?

Offline sdog

  • Devotee
  • *
  • Posts: 2039
Re: Experimental on Linux
« Reply #42 on: May 11, 2010, 11:12:36 PM »
i thought i compiled it with optimisation. since i'm not on the same machine now, i can't check which level. -O3 would be the most likely however.

Offline steffen

  • *
  • Posts: 78
Re: Experimental on Linux
« Reply #43 on: May 11, 2010, 11:31:30 PM »
What version of GCC are you using?

Offline ansgar

  • *
  • Posts: 80
Re: Experimental on Linux
« Reply #44 on: May 12, 2010, 12:13:28 AM »
Quote
Yes, you're right: I found that activating NEW_PATHING lets the program run fine, whilst OPTIMISE breaks it. I've now looked in the Makefile, and the only conclusion I can draw is that it's a toolchain bug. If anyone agrees I'll head over to GCC and post a bug report.
More likely is a bug in Simutrans.  An optimizer may make certain assumptions about the code, but a program could be written in a way that these are not true.  In that case it will likely crash or give wrong results.
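
To make that concrete, here is a generic illustration. It is not code from Simutrans, and the function names are invented. The first function depends on signed integer overflow wrapping around, which the C++ standard leaves undefined, so an optimising build is allowed to assume it never happens and may give a different answer than an unoptimised build. The second function asks the same question without ever overflowing.

```cpp
#include <cassert>
#include <climits>

// Fragile: relies on signed overflow wrapping around, which is undefined
// behaviour. An optimiser may assume x + 1 never overflows and reduce the
// whole test to `false`.
bool at_int_max_fragile(int x)
{
    return x + 1 < x;
}

// Well defined at every optimisation level: no overflow can occur.
bool at_int_max_safe(int x)
{
    return x == INT_MAX;
}

// Both agree for inputs that cannot overflow; only the safe version
// may be trusted for INT_MAX itself.
void check_agreement(int x)
{
    assert(x != INT_MAX);
    assert(at_int_max_fragile(x) == at_int_max_safe(x));
}
```

Code like the fragile version can appear to work for years without optimisation and then misbehave once OPTIMISE is switched on, without the compiler being at fault.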

Quote
ansgar: in the meantime, can you deactivate OPTIMISE for the amd64 autobuild?
Sure.

Offline neroden

  • Devotees (Inactive)
  • *
  • Posts: 831
  • Nathanael Nerode
Re: Experimental on Linux
« Reply #45 on: May 12, 2010, 03:48:45 AM »
Quote
Knightly,

I'm probably the silly coder in this case. I'd be very grateful if you could release your updated version;
James, I think I'm the one who made the 'silly coding error' comment, and you've already fixed it by pulling from my jp-devel branch.  Isn't git wonderful?  (Says the new convert who wouldn't use it two months ago!)

Quote
More likely is a bug in Simutrans.  An optimizer may make certain assumptions about the code, but a program could be written in a way that these are not true.  In that case it will likely crash or give wrong results.

One example which would work differently in 64-bit and 32-bit arithmetic is bit-twiddling with & and |  -- the bitmasks might be right for 32-bit, wrong for 64-bit.  There are a bunch of other things like this.
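
A small sketch of the bitmask problem, with made-up function names and nothing taken from the Simutrans sources. It assumes `int` is 32 bits wide, as it is on both i386 and amd64 Linux. The mask `~3u` is a 32-bit value; when it is combined with a 64-bit operand it is zero-extended, so the upper 32 bits of the result are silently cleared.

```cpp
#include <cassert>
#include <cstdint>

// Rounds a value down to a multiple of 4.
// Wrong for 64-bit operands: ~3u is the 32-bit value 0xFFFFFFFC, which is
// zero-extended to 64 bits, so the AND also clears bits 32..63.
std::uint64_t align_down_narrow_mask(std::uint64_t v)
{
    return v & ~3u;
}

// Correct: build the mask in the same width as the value.
std::uint64_t align_down_wide_mask(std::uint64_t v)
{
    return v & ~static_cast<std::uint64_t>(3);
}

// For values that fit in 32 bits the two versions agree, which is why
// this kind of bug goes unnoticed on a 32-bit build.
void check_small_value(std::uint32_t v)
{
    assert(align_down_narrow_mask(v) == align_down_wide_mask(v));
}
```

With the input 0x100000007 the narrow-mask version returns 0x4 while the wide-mask version returns 0x100000004.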

Offline steffen

  • *
  • Posts: 78
Re: Experimental on Linux
« Reply #46 on: May 12, 2010, 04:10:30 AM »
This reminds me why I use Python for my own programs lol.
In any case I found another reproducible segfault, but I'll post it as a new thread since it appears to be a different issue.

And yes, git is brilliant :)

Offline sdog

  • Devotee
  • *
  • Posts: 2039
Re: Experimental on Linux
« Reply #47 on: May 12, 2010, 05:37:14 AM »
python...  to-morrow i have to look at my fortran 77 code all day long.

you let me dream of tropical paradises with pythons and girls wearing perls and rubies, and certainly weren't named ada.


i'm pretty thankful james decided to use git; having to learn svn is something i gladly pass up for more joyful things. well, about twenty million less joyful things spring to my mind though. all are related to horrible deaths, root canal treatments, north-american coffee or having smalltalk with ada.

Offline sdog

  • Devotee
  • *
  • Posts: 2039
Re: Experimental on Linux
« Reply #48 on: May 19, 2010, 02:00:09 AM »
since i don't want to cp the file to my simutrans dir after building, i just put a symlink to the build in the development directory. it doesn't work. simutrans did not look in my pwd for the pakfile, and exited when it couldn't find one.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18753
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Experimental on Linux
« Reply #49 on: May 19, 2010, 10:18:16 AM »
Sdog,

I don't think that this is a Simutrans-Experimental specific issue.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9584
  • Languages: De,EN,JP
Re: Experimental on Linux
« Reply #50 on: May 19, 2010, 02:34:21 PM »
Simutrans uses the program directory, i.e. the directory the executable was copied to. If it should use the current directory instead, it must be called with "-use_workdir"; see the readme.

Offline sdog

  • Devotee
  • *
  • Posts: 2039
Re: Experimental on Linux
« Reply #51 on: May 19, 2010, 03:27:00 PM »
thanks prissi, i already expected i made a mistake when compiling


@james, yes, it isn't, but it is too unimportant to warrant a new thread.

Offline steffen

  • *
  • Posts: 78
Re: Experimental on Linux
« Reply #52 on: May 28, 2010, 10:13:54 PM »
Ok, well, I definitely can't reproduce that segfault I was getting anymore; either an update or my world recompile fixed it :)