9.0.1 ChangeLog-9.0.1 (11/20/2025)
9.0.0 ChangeLog-9.0.0 (04/25/2025)
8.0.6 ChangeLog-8.0.6 (11/12/2024)
8.0.5 ChangeLog-8.0.5 (04/24/2024)
8.0.4 ChangeLog-8.0.4 (11/16/2023)
8.0.3 ChangeLog-8.0.3 (04/26/2023)
8.0.2 ChangeLog-8.0.2 (11/17/2022)
8.0.1 ChangeLog-8.0.1 (04/26/2022)
7.3.2 ChangeLog-7.3.2 (04/26/2022)
8.0.0 ChangeLog-8.0.0 (11/24/2021)
7.3.1 ChangeLog-7.3.1 (11/24/2021)
7.3.0 - Add support for the new lockless ringbuffer that Linux 5.10 introduced.
Without the two patches, crash fails during session initialization
or "log" command fails with the error message:
crash: cannot determine length of symbol: log_end
(john.ogness@linutronix.de, nborisov@suse.com, k-hagio-ab@nec.com)
- Add support for VC exception stack on x86_64 Linux 5.10 and later
kernels that contain commit 02772fb9b68e ("x86/sev-es: Allocate and
map an IST stack for #VC handler").
(amakhalov@vmware.com)
- Fix regression for raw RAM dumpfiles. Commit f42db6a33f0e ("Support
core files with "unusual" layout") increased the minimal file size
from MIN_NETDUMP_ELF_HEADER_SIZE to SAFE_NETDUMP_ELF_HEADER_SIZE
which can lead to crash rejecting raw RAM dumpfiles. Without the
patch, the crash fails to start a session with the error message:
/var/tmp/ramdump_elf_XXXXXX: ELF header read: No such file or directory
crash: malformed ELF file: /var/tmp/ramdump_elf_XXXXXX
(zhaoqianli@xiaomi.com)
- Update mapping symbol filter in arm64_verify_symbol() to support the
long form of mapping symbols, e.g. "$x.<any...>". Without the
patch, the "dis" command cannot completely parse out the disassembly
of a function that has mapping symbols in the long form and misses
the tail part of the function.
(zhaoqianli@xiaomi.com)
- Move extensions/Makefile's ping check to recipe script. Without this
patch, in an environment where ping to github.com does not work,
"make clean" at the top-level crash directory always takes about 10
seconds unnecessarily.
(k-hagio-ab@nec.com)
- Fix for a segmentation fault when analyzing arm64 kernels that are
configured with CONFIG_IKCONFIG and have a strange entry that does
not contain the delimiter "=", such as "CONFIG_SECU+[some hex data]".
Without the patch, in the add_ikconfig_entry() function, strtok_r()
interprets it as consisting of a single token and the val variable
is set to NULL, and then strdup() crashes.
(liuyun01@kylinos.cn)
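The failure described above comes from assuming that every entry has a "NAME=value" shape. As a rough illustration only (written in Python rather than crash's C, with a hypothetical helper name that does not exist in crash), a parser can check for the delimiter before touching the value:

```python
def parse_ikconfig_entry(line):
    """Split an IKCONFIG-style line into (name, value).

    Hypothetical helper for illustration; it is not crash's
    add_ikconfig_entry(). Returns None when the "=" delimiter is
    missing, instead of passing a missing value on to later code.
    """
    name, sep, val = line.partition("=")
    if not sep:
        # No delimiter found: the whole line is in `name`, there is no value.
        return None
    return name, val
```

A caller would skip entries for which the helper returns None, which is the same idea as checking the value for NULL before duplicating it.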
- Fix a couple of issues that were detected by valgrind.
(d.hatayama@fujitsu.com)
- Add ability to un-set scope. The ability can come in very useful
when running automated pykdump scripts and needing scope to be
cleared between script runs.
(jpittman@redhat.com)
- Fix "sys [-t]|mod -S" after "mod -t" when crash runs with -s option.
Without the patch, the "sys [-t]" and "mod -S" options after "mod -t"
option fail with the error message:
sys: invalid structure member offset: tnt_false
FILE: kernel.c LINE: 11203 FUNCTION: show_kernel_taints_v4_10()
(k-hagio-ab@nec.com)
- Fix for "dev -d" option on Linux 5.11-rc1 and later kernels that
contain commit 0d02129e76edf91cf04fabf1efbc3a9a1f1d729a ("block:
merge struct block_device and struct hd_struct"). Without the patch,
the option fails with the error message:
dev: invalid structure member offset: hd_struct_dev
(k-hagio-ab@nec.com)
- Fix for "kmem -v" option on Linux 5.11-rc1 and later kernels that
contain commit 96e2db456135db0cf2476b6890f1e8b2fdcf21eb ("mm/vmalloc:
rework the drain logic"). Without the patch, the option will display
nothing or fail with the error message:
kmem: invalid kernel virtual address: <address> type: "vmlist addr"
(k-hagio-ab@nec.com)
- Add the base address of module to "mod" command output. Currently
the command shows the address of the module struct, but it is
inconvenient to know the address range of the module, so extend to
show the base address.
(yeyunfeng@huawei.com, k-hagio-ab@nec.com)
- Increase the value of __PHYSICAL_MASK_SHIFT_XEN to 52. The former
value of __PHYSICAL_MASK_SHIFT_XEN in crash (40) is smaller than the
kernel (52) since kernel commit 6f0e8bf167 (xen: support 52 bit
physical addresses in pv guests). This can cause x86_64_pud_offset()
to lose the most significant bits of pgd_pte, leading to a failed
xen_m2p() translation, resulting in crash failing with an error
message like this:
crash: read error: physical address: ffffffffffffffff type: "pud page"
(jbohac@suse.cz)
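To see why a 40-bit mask loses information, here is a small arithmetic sketch in Python. The entry value is made up for illustration, and the helper is a simplification: the real macros in crash also involve the page mask.

```python
def physical_mask(shift):
    """Return a mask covering physical address bits [0, shift)."""
    return (1 << shift) - 1

# Hypothetical page-table entry with address bits set above bit 40.
entry = 0x000F000012345000

masked_40 = entry & physical_mask(40)  # old shift: high bits are dropped
masked_52 = entry & physical_mask(52)  # new shift: high bits survive
```

With the 40-bit mask the result is 0x12345000, while the 52-bit mask preserves the full 0xF000012345000, so any later translation based on the truncated value starts from the wrong physical address.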
- Change log level print in older kernels. In older kernels that have
the variable-length-record log_buf, the log level and the log
flags/facility are not separated. Since the log level is only the
last three bits, and the flags/facility and level are separated in
5.10 and later kernels, only print those last three bits when using
'log -m'.
(jpittman@redhat.com)
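The masking step described above can be sketched in a few lines of Python. The constant and function names are illustrative, not taken from the crash sources; the only fact relied on is that the level occupies the last three bits of the combined field.

```python
LOG_LEVEL_MASK = 0x7  # the level is the last three bits of the combined field

def log_level(combined):
    """Extract the log level from a combined flags/facility/level value."""
    return combined & LOG_LEVEL_MASK
```

For example, a combined value of 0b10110 yields level 6, because the two upper bits are discarded.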
- Reduce crash build log. The verbose output of the tar command when
extracting the GDB source files occupies more than half of the crash
build log. It is not very helpful and makes the build log needlessly
long, especially in CI build tests.
(k-hagio-ab@nec.com)
- Fix for "bt" command on Linux 5.12-rc1 and later x86_64 kernels that
contain commit 951c2a51ae75 ("x86/irq/64: Adjust the per CPU irq
stack pointer by 8"). Without the patch, the "bt" command and some
of its options that read irq stack fail with the error message:
bt: read of stack at <address> failed
(k-hagio-ab@nec.com)
- Add valgrind support for crash's custom memory allocator. This
helps detect various memory errors in crash's custom memory
allocator.
(d.hatayama@fujitsu.com)
- Fix for a couple of invalid read/write issues detected by valgrind.
(d.hatayama@fujitsu.com)
- Fix "struct" command to print member array of list_heads correctly.
Without the patch, due to the way that an array of list_head entries
are printed, parsing of them fails and the command does not print
anything:
crash> struct blk_mq_ctx.rq_completed ffffc447ffc0f740
crash>
(jpittman@redhat.com)
- Do not pass through the 'sy' command to GDB. The GDB 'symbol-file'
command is prohibited in the crash utility, but its abbreviation
'sy' is not. This can discard the symbol table from
the current symbol file, and eventually causes the crash
utility to fail after executing the 'sys' command as below:
crash> sy
Discard symbol table from `/path/to/vmlinux'? (y or n) Please answer y or n.
Discard symbol table from `/path/to/vmlinux'? (y or n) No symbol file now.
crash> sys
double free or corruption (!prev)
Aborted (core dumped)
(lijiang@redhat.com)
- Refine zram related code for crash gcore command to support it.
(d.hatayama@fujitsu.com)
- Fix for the failure of 'set scope' command. Without the patch,
some commands such as 'sys' may cause subsequent 'set scope'
commands to fail.
(lijiang@redhat.com)
- Fix for offset print for function pointers that return pointers.
In the show_member_offset() function, when trying to handle function
pointers, the case for "(*" is handled. However, if the function
pointer returns a pointer or a pointer to a pointer, then the
condition is unhandled. This results in the offset not being
printed without the patch, for example:
crash> struct -o offload_callbacks
struct offload_callbacks {
struct sk_buff *(*gso_segment)(struct sk_buff *, netdev_features_t);
struct sk_buff **(*gro_receive)(struct sk_buff **, struct sk_buff *);
[16] int (*gro_complete)(struct sk_buff *, int);
}
(jpittman@redhat.com)
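One way to think about the missing case: the "(*" that introduces the member name may be preceded by zero, one or two "*" characters belonging to the return type. The following Python sketch is not the logic of show_member_offset(); it uses a regular expression, chosen here for illustration, that locates the member name regardless of what precedes "(*":

```python
import re

# Matches "(*name)(" anywhere in a declaration, so the return type may
# itself be a pointer ("*(*name)(") or a pointer to a pointer ("**(*name)(").
FUNC_PTR_MEMBER = re.compile(r"\(\*\s*(\w+)\s*\)\s*\(")

def function_pointer_member(declaration):
    """Return the member name of a function-pointer declaration, or None."""
    match = FUNC_PTR_MEMBER.search(declaration)
    return match.group(1) if match else None
```

Applied to the three members of offload_callbacks shown above, the helper returns gso_segment, gro_receive and gro_complete respectively.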
- Change functions within extensions/echo.c to be static and document
the issue in code comments, for extension developers who take
echo.c as a reference, to avoid the issue that symbols in an extension
module are overridden by same-named symbols from a previously loaded one.
(ltao@redhat.com)
- Fix for 'bt' command and options on Linux 5.8-rc1 and later x86_64
kernels that contain merge commit 076f14be7fc9. The merged patches
changed the name of exception functions that have been used by the
crash utility to check the exception frame. Without the patch, the
command and options cannot display it.
(k-hagio-ab@nec.com)
- Fix for xen kernels that contain commit edcb5cf84f05
("x86/paravirt/xen: Remove xen_patch()"). Without the patch,
crash fails with an error message like this:
crash: seek error: physical address: 83640e000 type: "pud page"
(john.p.donnelly@oracle.com, k-hagio-ab@nec.com)
- Remove extensions/trace.c file, as the extension module moved to
the separate repository from the crash repository.
(k-hagio-ab@nec.com)
- Fix for uvtop conversion on ARM with LPAE. Without the patch,
arm_uvtop() calls arm_lpae_vtop() with the LPAE and it can use
LPAE_VTOP() also for a user virtual address. As a result, commands
that use uvtop conversion such as "ps -a", "gcore" fail as readmem()
for a uvaddr returns a seek error:
ps: cannot access user stack address: <address>
(k-hagio-ab@nec.com)
- Handle 1GB block for VM_L3_4K on arm64 architecture. Without the
patch, "vtop" command cannot display the block as a 1GB hugepage.
(johan.erlandsson@sony.com)
- Implement initial support for the MIPS64 architecture.
(tangyouling@loongson.cn, chenhuacai@loongson.cn)
- Fix for HZ calculation using cfq_slice_async on Linux 4.8 and later
kernels that contain commit 9a7f38c42c2b ("cfq-iosched: Convert from
jiffies to nanoseconds"). Without the patch, the HZ calculation
results in a wrong and big value for machdep->hz and crash can show
a wrong uptime and timestamps in "log -T".
(martin.moore@hpe.com)
- Fix for HZ calculation on Linux 4.8 and later kernels that contain
commit 9a7f38c42c2b ("cfq-iosched: Convert from jiffies to
nanoseconds"). Without the patch, the HZ value can be set to a
hardcoded wrong value.
(k-hagio-ab@nec.com)
(04/27/21)
7.2.9 - Fix for an ARM64 gcc-10 compilation error. Without the patch, the
build of the embedded gdb module fails with an error message that
indicates "multiple definition of 'tdesc_aarch64'".
(anderson@redhat.com)
- Fix for the "log" command. Without the patch, the command's output
may be truncated, ending with the error message "log: invalid log_buf
entry encountered".
(chenqiwu@xiaomi.com)
- Fix to allow the translation of ARM64 FIXMAP addresses located in
the virtual memory region between the end of the vmalloc region and
the beginning of the vmemmap region. Without the patch, reads of
virtual addresses within that region are not recognized properly
and will fail.
(zhaoqianli@xiaomi.com)
- Introduction of a new "extend -s" option, which shows all available
shared object extension modules that are located in the directories
that are part of the normal search path that is used when a shared
object is loaded without a fully-qualified pathname.
(w@laoqinren.net)
- Fix for the "bpf -m|-M" options on Linux 5.3 and later kernels that
contain commit 3539b96e041c06e4317082816d90ec09160aeb11, titled
"bpf: group memory related fields in struct bpf_map_memory". Without
the patch, the options print "(unknown)" for MEMLOCK and UID.
(k-hagio-ab@nec.com)
- Enhancement to the "bpf -p|-P" options to display the eBPF program
name string.
(k-hagio-ab@nec.com)
- Fix for reading compressed kdump dumpfiles from systems with physical
memory located at extraordinarily high addresses. In a system with
a physical address range from 0x602770ecf000 to 0x6027ffffffff, the
crash utility fails during session initialization due to an integer
overflow, ending with the error message "crash: vmlinux and vmcore
do not match!".
(chenjialong@huawei.com)
- Enhancement of the "struct -r" option to support the raw memory
display of a single data structure member. Without the patch, the
option only supported the raw display of a complete data structure.
(asmadeus@codewreck.org)
- Modify the display behavior of the "struct -r" option so as to scale
the minimum display size from the size of a per-architecture long
(32-bits or 64-bits) down to 8-bits, 16-bits or 32-bits when the
requested size is equal to one of the smaller sizes.
(asmadeus@codewreck.org)
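The scaling rule can be restated as a small decision function. This is a hedged Python sketch of the rule as worded above, with hypothetical names; it does not reproduce the crash implementation:

```python
def raw_display_unit(requested_bytes, long_bytes=8):
    """Pick the display unit (in bytes) for a raw memory dump.

    Assumption for illustration: the unit drops to 1, 2 or 4 bytes only
    when the requested size is exactly one of those sizes and is smaller
    than the architecture's long; otherwise the long size is used.
    """
    if requested_bytes in (1, 2, 4) and requested_bytes < long_bytes:
        return requested_bytes
    return long_bytes
```

On a 64-bit architecture a 2-byte member would be shown in 16-bit units, while a 3-byte or 8-byte request would still use 64-bit units.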
- Introduce a new ARM64 "--machdep vabits_actual=<value>" command
line option for Linux 5.4 and later dumpfiles, which require the
kernel's dynamically-determined "vabits_actual" value for virtual
address translation. Without the patch, the crash session fails
during initialization with the error message "crash: cannot determine
VA_BITS_ACTUAL". This option will become unnecessary when the
proposed TCR_EL1.T1SZ vmcoreinfo entry is incorporated into the
kernel.
(anderson@redhat.com)
- Fix for "kmem -[sS]" options on Linux 4.14 and later kernels built
with CONFIG_SLAB_FREELIST_HARDENED enabled. Without the patch, there
will be error messages of the type "kmem: <cache name> slab: <address>
invalid freepointer: <obfuscated address>" for caches created during
SLUB bootstrap, as they are likely to have s->random == 0.
(hbathini@linux.ibm.com)
- If readmem() receives a user-space address in a page that has been
swapped to the zswap compressed swap cache, an attempt will be made
to find and decompress the page.
(zhaoqianli@xiaomi.com)
- Fix for the "mount -n [pid|task]" option when running on a live
system. Without the patch, if the [pid|task] has been created since
the last internal task table refresh, the command fails with the
error message "mount: invalid task or pid value: <value>".
(w@laoqinren.net)
- Introduction of the "log -T" option, which translates the leading
timestamp value of each message into human readable format.
(w@laoqinren.net)
- When kernels are built with LLVM, the names of many symbols may be
appended with an ".llvm.<number>" string. As a result, commands
such as "irq" fail with the error message "irq: neither irq_desc,
_irq_desc, irq_desc_ptrs or irq_desc_tree symbols exist". This
patch adds the LLVM-generated string to the other strings that are
stripped from symbols before they are stored.
(zhaoqianli@xiaomi.com)
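Stripping such a suffix can be illustrated with a short Python sketch. The regular expression and function name are assumptions made for this example; crash's C code handles the suffix with its own string routines.

```python
import re

# Assumed suffix shape, taken from the description above: ".llvm.<number>"
LLVM_SUFFIX = re.compile(r"\.llvm\.\d+$")

def strip_llvm_suffix(symbol):
    """Remove a trailing ".llvm.<number>" from a symbol name, if present."""
    return LLVM_SUFFIX.sub("", symbol)
```

A name such as "irq_desc.llvm.123456" becomes "irq_desc", so a later lookup of "irq_desc" succeeds, while names without the suffix are returned unchanged.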
- Prepare for the introduction of ARM64 8.3 Pointer Authentication
as in-kernel feature. The value of CONFIG_ARM64_KERNELPACMASK
will be exported as a vmcoreinfo entry, and will be used with text
return addresses on the kernel stack.
(amit.kachhap@arm.com)
- Several fixes for ARM64 kernels:
(1) Linux kernel patch "arm64: mm: Introduce vabits_actual"
introduced "physvirt_offset", which is not equal to
(PHYS_OFFSET - PAGE_OFFSET) when KASLR is enabled.
physvirt_offset is calculated in arch/arm64/mm/init.c
before memstart_addr (PHYS_OFFSET) is randomized. Let
arm64_VTOP() and arm64_PTOV() use physvirt_offset instead,
whose default value is set to (phys_offset - page_offset).
(2) For ARM64 RAM dumps without any vmcoreinfo and KASLR passed as
argument, "_stext_vmlinux" is not set. This causes incorrect
calculation of vmalloc_start with VA_BITS_ACTUAL.
(3) For ARM64 RAM dumps without vmcoreinfo, get
CONFIG_ARM64_VA_BITS from in-kernel config. Without this,
vmemmap size is calculated incorrectly.
(4) Fix the vmemmap_start to match with what the kernel uses.
(vinayakm.list@gmail.com)
- Replace people.redhat.com references with github equivalents.
(anderson@redhat.com)
- Implement support for user-space zram reads on x86_64 for recent
Fedora kernel version 5.6.7-200.fc31. The patch adds the following:
(1) Redefine _PFN_BITS() macro to use MAX_POSSIBLE_PHYSMEM_BITS.
(2) Fix to determine whether address_space.i_pages is a radix tree
or an xarray.
(3) Fix to not mistakenly select the "lzo" compressor when the
kernel has used the default "lzo-rle" compressor.
(4) Since zram may be provided as a kernel module, it would be
necessary to load its debuginfo during the crash session;
therefore perform the zram structure-size/member-offset
initializations when first required instead of during
session initialization.
(5) Handle the zram_table_entry structure member name change
from "value" to "flags".
(d.hatayama@jp.fujitsu.com)
- Add support for 1GB huge pages to "vtop" command on x86_64. Without
this patch, the command with a user virtual address corresponding to
a 1GB huge page fails with the error message 'vtop: seek error:
physical address: <address> type: "page table"'.
(lirongqing@baidu.com, chukaiping@foxmail.com)
- Fix six spelling typos in help.c.
(standby24x7@gmail.com)
- Change tcr_el1_t1sz vmcoreinfo entry name to TCR_EL1_T1SZ according
to kernel commit bbdbc11804ff ("arm64/crash_core: Export TCR_EL1.T1SZ
in vmcoreinfo").
(bhsharma@redhat.com)
- Fix for a failure of calculating kaslr_offset due to an sadump format
restriction. Without the patch set, calculating kaslr_offset fails
because it is based on the assumption that unused part of register
values in the sadump format are always zero cleared.
(d.hatayama@fujitsu.com)
- Support for huge holes in vmem of VMware VMSS dumpfiles. Without the
patch, if the hole is big enough, the multiplication by page size
will truncate as it's operating on a uint32_t.
(minipli@grsecurity.net)
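The truncation is easy to reproduce with plain arithmetic. The Python sketch below emulates 32-bit and 64-bit unsigned multiplication by masking; the page size and page count are example values, not figures from a real dumpfile.

```python
PAGE_SIZE = 4096  # example page size

def byte_offset_u32(pages):
    """Multiply as a uint32_t would: the result wraps at 2**32."""
    return (pages * PAGE_SIZE) & 0xFFFFFFFF

def byte_offset_u64(pages):
    """Multiply in 64 bits: large holes are represented correctly."""
    return (pages * PAGE_SIZE) & 0xFFFFFFFFFFFFFFFF

hole_pages = 0x100000  # a 4 GiB hole expressed in 4 KiB pages
```

For this hole the 32-bit product wraps to 0, whereas the 64-bit product is 0x100000000, which is why widening the operand before the multiplication matters.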
- Beautify and extend debug log for VMware VMSS dumpfiles. Without the
patch, the parser's debug log is missing a few line breaks as well as
some crucial information, like control register dumps.
(minipli@grsecurity.net)
- Support core files with an unusual layout in which the ELF program
headers do not directly follow the ELF header, such as vmcores
generated with the 'vmss2core' tool.
(minipli@grsecurity.net)
- Fix for the "log -T" option when crash is started with "--minimal"
option. Without the patch, crash will spin at 100% and continuously
crash at a divide by zero. Disallow the option in minimal mode.
(dwysocha@redhat.com)
- Remove raw-view from s390dbf. With kernel commit ecb1ff6833c4
("s390/debug: remove raw view"), the raw-view is no longer supported
by s390 debug feature. Since there has never been a single user of
the raw-view, remove it from crash as well.
(zaslonko@linux.ibm.com)
- Support s390 debug feature version 3, which was introduced by kernel
commit 0990d836cecb ("s390/debug: debug feature version 3").
(zaslonko@linux.ibm.com)
- Basic support for PaX's split module layout. PaX and grsecurity
kernels split module memory into dedicated r/x and r/w mappings using
'*_rw' and '*_rx' named member variables in 'struct module'. To add
basic support for such kernels, detect the split layout by testing
for the corresponding structure members and use these instead.
(minipli@grsecurity.net)
- Fix for the "kmem -i" option on Linux 5.9-rc1 and later kernels that
contain commit 1008fe6dc36d ("block: remove the all_bdevs list").
Without the patch, the option fails halfway with the error message
'kmem: cannot resolve: "all_bdevs"'.
(k-hagio-ab@nec.com)
- Fix for the "irq -a" option on Linux 4.3 or later kernels that
contain commit 9df872faa7e1 ("genirq: Move field 'affinity' from
irq_data into irq_common_data"). Without the patch, the option
cannot work with the message "irq: -a option not supported or
applicable on this architecture or kernel".
(k-hagio-ab@nec.com)
- Append time zone explicitly to each output of date and time like
"DATE: Thu Nov 29 06:44:02 JST 2018".
(k-hagio-ab@nec.com)
- Fixes for the "trace.so" extension module on Linux 5.6 and later
kernels that contain commits:
(1) 1c5eb4481e01 ("tracing: Rename trace_buffer to array_buffer")
(2) 13292494379f ("tracing: Make struct ring_buffer less ambiguous")
With the patch set, rename trace_buffer to array_buffer and
ring_buffer to trace_buffer respectively.
(valentin.schneider@arm.com)
- Fix for the "help -D" option listing uninteresting register entries
for SADUMP dumpfiles.
(d.hatayama@fujitsu.com)
- Fix for an initialization-time failure due to offset change of the
name member of struct uts_namespace that might be introduced by
linux-next commit 9a56493f6942 ("uts: Use generic ns_common::count").
(egorenar@linux.ibm.com)
- Add support for VMware guestdump (debug.guest) and vmem (debug.vmem)
files. To use, the companion debug.vmem file must be present in the
same directory as the debug.guest file.
(amakhalov@vmware.com)
- Fix for the "extend" command on a PPC64 targeted x86_64 crash binary.
Without the patch, the command on an x86_64 crash binary that can be
used to analyze ppc64le dumpfiles fails with the error message
"extend: <path to extension>: not an ELF format object".
(aeasi.linux@gmail.com, k-hagio-ab@nec.com)
- Fix for a failure to match arm/aarch64 ELF format of xendump file.
(goodbach@gmail.com)
- Fix for the x86_64 "bt" command in cases where the pt_regs is not
present in the stack. Without the patch, the command can be
incomplete with the error message 'bt: seek error: kernel virtual
address: <address> type: "pt_regs"'.
(dmair@suse.com)
- Fix for the crash.ko memory driver build with Linux 5.8 and later
kernels that contain commit fe557319aa06 ("maccess: rename
probe_kernel_{read,write} to copy_{from,to}_kernel_nofault").
Additionally, due to commit 0493cb086353 ("maccess: unexport
probe_kernel_write()"), writing kernel memory is no longer possible
from a module. Without this patch, build with the kernels fails
with the error message "error: implicit declaration of function
'probe_kernel_write'".
(ptesarik@suse.com)
- Fix for the memory_driver/Makefile for Linux 5.4 and later kernels
that contain commit 7e35b42591c0 ("kbuild: remove SUBDIRS support").
Without the patch, the "make" command in the memory_driver directory
doesn't build crash memory driver module as expected.
(k-hagio-ab@nec.com)
- Improvements of KASLR offset detection for QEMU, VMware VMSS and
SADUMP dumpfiles:
(1) Try all CPUs to provide CR3 and IDTR, because these registers
on CPU0 may be uninitialized or clobbered.
(2) Support 5-level page table by using LA57 bit in CR4.
(3) Get KASLR offset by walking page tree.
(amakhalov@vmware.com)
- Fix for an initialization-time failure with QEMU dumpfiles with Linux
5.8 and later x86_64 kernels that contain commit 9d06c4027f21
("x86/entry: Convert Divide Error to IDTENTRY"), which renamed the
divide_error handler to asm_exc_divide_error.
(nborisov@suse.com)
- Fix for several compiler warnings on 32-bit architectures when
building with "make warn". Without the patch, gcc generates the
message "warning: format '%ld' expects argument of type 'long int',
but argument 4 has type 'uint64_t' [-Wformat=]" and similar ones as
a result of crash commit 3fedbee9bfbb ("vmware_guestdump: new input
format").
(k-hagio-ab@nec.com)
- Speed up session initialization by avoiding unnecessary processing
in the stkptr_to_task() function when sp is 0 on some architectures.
Without the patch, as it runs through each task's stack to find
whether the given address is in its range, on a system with about
1500 CPUs and 165k running tasks, it takes about a day to finish
session initialization. With the patch applied, it only takes about
5-10 minutes.
(hbathini@linux.ibm.com)
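The speed-up comes from an early exit before the per-task scan. The following Python sketch shows that shape only; the task tuple layout and the return convention are assumptions for the example, not the signature of crash's stkptr_to_task().

```python
def stkptr_to_task(sp, tasks):
    """Find the task whose stack range contains sp.

    `tasks` is a hypothetical list of (task_id, stack_base, stack_size)
    tuples. A zero stack pointer cannot belong to any stack, so the
    full scan is skipped in that case.
    """
    if sp == 0:
        return None  # early exit: avoids walking every task's stack range
    for task_id, stack_base, stack_size in tasks:
        if stack_base <= sp < stack_base + stack_size:
            return task_id
    return None
```

When this lookup is invoked once per CPU during initialization, skipping the scan for sp == 0 removes work proportional to the number of tasks on each such call.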
(11/20/20)
7.2.8-2.fc32
Information for build crash-7.2.8-2.fc32
https://koji.fedoraproject.org/koji/buildinfo?buildID=1455371
7.2.8 - Fix for Linux 5.4-rc1 and later kernels that contain commit
688fcbfc06e4fdfbb7e1d5a942a1460fe6379d2d, titled "mm/vmalloc:
modify struct vmap_area to reduce its size". Without the
patch "kmem -v" will display nothing; other architectures
that utilize the vmap_area_list to determine the base of
mapped/vmalloc address space will fail.
(anderson@redhat.com)
- Fix for Linux 5.4-rc1 and later kernels that contain commit/merge
e0703556644a531e50b5dc61b9f6ea83af5f6604, titled "Merge tag
'modules-for-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux",
which introduces symbol namespaces. Without the patch, and depending
upon the architecture:
(1) the kernel module symbol list will contain garbage entries
(2) the session fails during session initialization with a dump of
the internal buffer allocation stats followed by the message
"crash: cannot allocate any more memory!"
(3) the session fails during session initialization with a
segmentation violation.
(anderson@redhat.com)
- Fix for the "timer -r" option on Linux 5.4-rc1 and later kernels
that contain commit 511885d7061eda3eb1faf3f57dcc936ff75863f1, titled
"lib/timerqueue: Rely on rbtree semantics for next timer". Without
the patch, the option fails with the following error "timer: invalid
structure member offset: timerqueue_head_next".
(k-hagio@ab.jp.nec.com)
- Fix for a "[-Wstringop-truncation]" compiler warning emitted when
symbols.c is built in a Fedora Rawhide environment with gcc-9.0.1
or later.
(anderson@redhat.com)
- Fix for the "kmem -n" option on Linux-5.4-rc1 and later kernels that
contain commit b6c88d3b9d38f9448e0fcf44847a075ea81d5ca2, titled
"drivers/base/memory.c: don't store end_section_nr in memory blocks".
Without the patch, the command option fails with the error message
"kmem: invalid structure member offset: memory_block_end_section_nr".
(msys.mizuma@gmail.com)
- Fix for Linux 4.19.5 and later 4.19-based x86_64 kernels which
are NOT configured with CONFIG_RANDOMIZE_BASE and have backported
kernel commit d52888aa2753e3063a9d3a0c9f72f94aa9809c15, titled
"x86/mm: Move LDT remap out of KASLR region on 5-level paging",
which modified the 4-level and 5-level paging PAGE_OFFSET values.
Without this patch, the crash session fails during initialization
with the error message 'crash: seek error: kernel virtual address:
<address> type: "tss_struct ist array"'.
(anderson@redhat.com)
- Additional fix for the "kmem -n" option on Linux-5.4-rc1 and later
kernels that contain commit b6c88d3b9d38f9448e0fcf44847a075ea81d5ca2,
titled "drivers/base/memory.c: don't store end_section_nr in memory
blocks". The initial fix only addressed the x86_64 architecture;
this incremental patch addresses the other architectures.
(msys.mizuma@gmail.com)
- In the unlikely event that the panic task in a dumpfile cannot be
determined by the normal means, scan the kernel log buffer for panic
keywords, and if found, generate the panic task from the CPU number
that is specified following the panic message.
(chenqiwu@xiaomi.com)
- Adjust a crash-7.1.8 patch for support of /proc/kcore as the live
memory source in Linux 4.8 and later x86_64 kernels configured with
CONFIG_RANDOMIZE_BASE, which randomizes the unity-mapping PAGE_OFFSET
value. Since the problem only arises before the determination of the
randomized PAGE_OFFSET value, restrict the patch such that it only
takes effect during session initialization.
(anderson@redhat.com)
- Add support for extended numbering support in ELF dumpfiles to handle
more than PN_XNUM (0xffff) program headers. If the real number of
program header table entries is equal to or greater than PN_XNUM, the
e_phnum field of the ELF header is set to PN_XNUM, and the actual
number is set in the sh_info field of the section header at index 0.
(k-hagio@ab.jp.nec.com)
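The extended-numbering rule reduces to one comparison. Here is a minimal Python sketch, assuming the two header fields have already been read from the file; the function name is illustrative.

```python
PN_XNUM = 0xFFFF  # escape value stored in e_phnum when the count does not fit

def program_header_count(e_phnum, section0_sh_info):
    """Return the real number of program headers in an ELF file.

    When e_phnum holds PN_XNUM, the actual count is taken from the
    sh_info field of the section header at index 0.
    """
    if e_phnum == PN_XNUM:
        return section0_sh_info
    return e_phnum
```

A file with 70000 segments would carry e_phnum == 0xffff and sh_info == 70000 in section header 0, and the helper returns 70000 for it.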
- Fix for a "warning: large integer implicitly truncated to unsigned
type [-Woverflow]" compiler message generated on 32-bit architectures
as a result of the "Additional fix for the kmem -n option" patch
above.
(anderson@redhat.com)
- Add support for handling openSUSE vmlinux files which will be shipped
in .xz compressed format. Without the patch, only gzip and bzip2
formats are supported.
(jirislaby@gmail.com)
- Fix for the determination of the ARM64 page size on Linux 4.4 and
earlier kernels that do not have vmcoreinfo data. Without the patch,
the crash session fails during initialization with the error message
"crash: cannot determine page size".
(chenqiwu@xiaomi.com)
- Determine the ARM64 kernel's "vabits_actual" value by reading the
new TCR_EL1.T1SZ vmcoreinfo entry.
(bhsharma@redhat.com)
- Fix to determine the ARM64 kernel's "vabits_actual" value from the
ELF header of a dumpfile created with the "snap.so" extension module.
(anderson@redhat.com)
- Fix two typos in the examples section of the "help bt" display, which
mistakenly show "bf -f" and "bf -FF" instead of "bt -f" and "bt -FF".
(austindh.kim@gmail.com)
- Similar to ARM64, the X86_64, PPC64 and S390x architectures will use
the exported value of MAX_PHYSMEM_BITS from the vmcoreinfo data as
the preferred method if it is available.
(anderson@redhat.com)
- If an S390X kernel crashes before vmcoreinfo initialization, there is
no way to extract the KASLR offset for such early dumps. In a new
S390X kernel patch, the KASLR offset will be stored in the lowcore
memory during early boot and then overwritten after vmcoreinfo is
initialized. This patch allows crash to identify the KASLR offset
that is stored in the lowcore memory.
(zaslonko@linux.ibm.com)
- Fix for a crash-7.2.7 regression that determined the value of the
ARM64 kernel SECTION_SIZE_BITS by reading the in-kernel configuration
data if there is no VMCOREINFO data available. In that case, without
the patch, a double-free exception may occur.
(anderson@redhat.com)
- Fix for segmentation violation if the gdb_readmem_callback() function
gets called from other than a crash command, such as from an epython
command from the mpykdump.so extension module.
(anderson@redhat.com)
- Fix for the "dis -s" option when running against kernels that have
been configured with CONFIG_RANDOMIZE_BASE=y (KASLR). Without the
patch, the command option indicates that the FILE and LINE numbers
are "(unknown)", and that "source code is not available".
(anderson@redhat.com)
- Fix for newer Xen hypervisors, which fail during initialization with
the error message "crash: cannot resolve init_tss". This is caused
by a change in the Xen hypervisor with commit 78884406256, from
4.12.0-rc5-763-g7888440625. In that patch the tss_struct structure
was renamed to tss64 and the tss_page structure was introduced,
which contains a single tss64. Now tss information is accessible
via the symbol "per_cpu__tss_page".
(dietmar.hahn@ts.fujitsu.com)
- When accessing the ARM64 kernel's "crash_notes" array, continue to
read the per-cpu NT_PRSTATUS note contents if an invalid note is
encountered. Without the patch, if an invalid note is found, all
other notes were ignored, and subsequent "bt" attempts on the active
tasks would fail.
(chenqiwu@xiaomi.com, anderson@redhat.com)
- When accessing the 32-bit ARM kernel's "crash_notes" array, continue
to read the per-cpu NT_PRSTATUS note contents if an invalid note is
encountered. Without the patch, if an invalid note is found, all
other notes were ignored, and subsequent "bt" attempts on the active
tasks would fail.
(chenqiwu@xiaomi.com, anderson@redhat.com)
- Fix for the "log -a" option. The kernel's sk_buff.len field is a
32-bit unsigned int, but crash was reading its 32-bit value into a
64-bit unsigned long stack variable. All extra bits that pre-existed
in the upper 32-bits of the stack variable were passed along as part
of a buffer size request; if the upper 32 bits were non-zero,
then the command would fail with a dump of the internal buffer
allocation stats followed by the message "log: cannot allocate any
more memory!".
(anderson@redhat.com)
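The effect of copying a 32-bit value into an uninitialized 64-bit variable can be emulated in Python with the struct module. This sketch assumes a little-endian layout and uses made-up stale contents; it only models the bug pattern, not the crash code itself.

```python
import struct

def read_len_into_stale_long(len_bytes, stale_value):
    """Overwrite only the low 4 bytes of a 64-bit variable (the bug)."""
    buf = bytearray(struct.pack("<Q", stale_value))
    buf[:4] = len_bytes  # upper 4 bytes keep whatever was there before
    return struct.unpack("<Q", bytes(buf))[0]

def read_len_as_u32(len_bytes):
    """Read the field at its real width (the fix)."""
    return struct.unpack("<I", len_bytes)[0]

length_field = struct.pack("<I", 100)   # sk_buff.len == 100
stale = 0xDEADBEEF00000000              # hypothetical leftover stack contents
```

The buggy read produces 0xDEADBEEF00000064 instead of 100, which explains why the resulting buffer request could not be satisfied.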
- When determining the ARM64 kernel's "vabits_actual" value by reading
the new TCR_EL1.T1SZ vmcoreinfo entry, display its value during
session initialization only when invoking crash with "-d1" or larger
-d debug value.
(anderson@redhat.com)
- Update copyright to 2020 in crash version output.
(anderson@redhat.com)
- Fix for ARM64 when running against Linux 5.5-rc1 and later kernels
that contain commit b6e43c0e3129ffe87e65c85f20fcbdf0eb86fba0, titled
"arm64: remove __exception annotations". Without the patch, the
ARM64 crash session fails during initialization with the error
message "crash: cannot resolve __exception_text_start".
(anderson@redhat.com)
- Fix for support of ELF format kdump vmcores from S390X KASLR kernels.
Without the patch, the crash session fails during initialization with
the error message "crash: vmlinux and vmcore do not match!".
(anderson@redhat.com)
- Fix for support of S390X standalone dumpfiles and LKCD dumpfiles that
were taken from S390X KASLR kernels.
(zaslonko@linux.ibm.com)
- Rework the previous patch for support of S390X standalone dumpfiles
and LKCD dumpfiles that were taken from S390X KASLR kernels to avoid
calling an s390x-specific function from generic code.
(zaslonko@linux.ibm.com)
- Fix for a gcc-10 compilation error. Without the patch, the build of
the crash library fails with a stream of error messages indicating
"multiple definition of 'diskdump_flags'".
(anderson@redhat.com)
(01/30/20)
7.2.7-1.fc32
Information for build crash-7.2.7-1.fc32
https://koji.fedoraproject.org/koji/buildinfo?buildID=1388374
7.2.7 - Document the "-N", "-g" and "-z" options in the "help" command's
help page.
(k-hagio@ab.jp.nec.com)
- Fix for a crash-7.2.6 regression to the "p" command. Without the
patch, a gdb pass-through command construct such as:
p ((struct zone *)0xffff901e3ffda000)->min_slab_pages
gets parsed incorrectly, and the "-" is mistaken for an argument
option, and each of the subsequent characters is marked as an
"invalid option".
(dwysocha@redhat.com)
- Export the get_mount_list() and get_dump_level() functions in defs.h
for use by extension modules.
(k-hagio@ab.jp.nec.com)
- Change the gating of a debug message in the do_xarray_dump_cb()
function from CRASHDEBUG(0) to CRASHDEBUG(1). Without the patch,
users of the XArray callback functionality may see messages of the
sort "entry has XARRAY_TAG_MASK bits set: 239ab0024001" without
setting a debug number.
(anderson@redhat.com)
- Fix for Linux 5.2 and later x86_64 kernels that contain kernel commit
e6401c13093173aad709a5c6de00cf8d692ee786, titled "x86/irq/64: Split
the IRQ stack into its own". Without the patch, the per-cpu IRQ
stack addresses cannot be determined, and as a result backtraces
that utilize an IRQ stack will fail.
(anderson@redhat.com)
- Fix to allow live system analysis of s390x kernels that have been
configured with CONFIG_RANDOMIZE_BASE=y (KASLR). Without the patch,
the "--kaslr=<offset>" command line option is required.
(anderson@redhat.com)
- Fix for Linux 5.2 and later x86_64 kernels that contain kernel commit
019b17b3ffe48100e52f609ca1c6ed6e5a40cba1, titled "x86/exceptions: Add
structs for exception stacks". Without the patch, the exception
stack sizes cannot be determined, and as a result backtraces
that initiate from an exception stack will fail with error messages
indicating "bt: invalid kernel virtual address: <address> type:
stack contents" and then "bt: read of stack at <address> failed".
(anderson@redhat.com)
- Two fixes for the "sys -c" option, one that significantly shortens
the time consumed by the option, and a second fix that addresses
occasional situations where the file and line number data are not
displayed.
(k-hagio@ab.jp.nec.com)
- Fix for a signed/unsigned comparison bug in vmcoreinfo_read_string()
which could lead to a segmentation violation in the highly unlikely
event of a zero length or severely truncated VMCOREINFO note.
(nudasnev@microsoft.com)
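The general hazard behind the vmcoreinfo_read_string() entry above can
be sketched in C. This is a hypothetical find_key() helper written for
illustration, not the crash implementation: it shows why a length check
is needed before subtracting one unsigned size from another.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical search helper: return the offset of "key" within the
 * first "size" bytes of "buf", or -1 if it is not present. */
long find_key(const char *buf, size_t size, const char *key)
{
    size_t klen = strlen(key);
    size_t i;

    /* Guard first: with unsigned arithmetic, "size - klen" wraps to a
     * huge value when size < klen, and the loop below would then read
     * far beyond the end of a zero-length or truncated buffer. */
    if (klen == 0 || size < klen)
        return -1;

    for (i = 0; i <= size - klen; i++)
        if (memcmp(buf + i, key, klen) == 0)
            return (long)i;

    return -1;
}
```

Calling find_key() on an empty buffer returns -1 immediately instead of
scanning out of bounds.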
- Fix for the determination of the ARM64 "kimage_voffset" value
in Linux 4.6 and later kernels if an ELF format dumpfile:
(1) does not contain its value in a VMCOREINFO note, and
(2) if the kernel image was loaded at a higher address than the
system's physical base address.
This may happen, for example, when analyzing a dynamically-created
ramdump-to-ELF dumpfile.
(zhaoqianli@xiaomi.com, anderson@redhat.com)
- Fix for Linux 4.16 and later ARM64 kernels that contain kernel commit
fa2a8445b1d3810c52f2a6b3a006456bd1aacb7e, titled "arm64: allow ID map
to be extended to 52 bits", and which have been configured with both
CONFIG_DEVMEM=y and CONFIG_STRICT_DEVMEM=y. Without the patch, an
inconsequential error message indicating "crash: read error: kernel
virtual address: <address> type: idmap_ptrs_per_pgd" is displayed
during initialization.
(anderson@redhat.com)
- Introduction of a new "bt -p" option that generates a backtrace of
the panic task, regardless of the current context. This option is
only applicable when running against dumpfiles in which the panic
task is known.
(atomlin@redhat.com)
- When the gdb-7.6.patch file is updated in an existing source tree,
it gets re-applied during the next build using "patch -N --fuzz=0",
which ignores patches that have already been applied. However, if
a gdb file has been modified multiple times, the secondary patching
may fail to recognize that a given patch has been previously applied,
and will attempt to re-apply it. To prevent any unintended
consequences, the gdb-7.6.patch file will also act as a shell script
invoked by the Makefile, which restores any selected gdb file to its
original state prior to all secondary patch applications.
(anderson@redhat.com)
- As an addendum to the previous patch for updating the gdb-7.6.patch
in an existing pre-built source tree, when rebuilding for the ppc64
architecture, do not restore the selected gdb files. This is because
the gdb-7.6-ppc64le-support.patch will have modified the selected
files during the initial build.
(anderson@redhat.com)
- Extend the "timer" command with a new "TTE" column that displays the
remaining time in jiffies until the expiration of a timer entry, and
where a negative value displays the number of jiffies that have
elapsed since a timer has expired.
(oleksandr@redhat.com)
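The sign convention of the "TTE" column described above can be sketched
as a small C function. The name timer_tte() and its parameters are
hypothetical and only illustrate the arithmetic; they are not taken
from the crash sources.

```c
/* Signed distance, in jiffies, from "now" to a timer's expiry.
 * Positive: jiffies remaining until the timer expires.
 * Negative: jiffies elapsed since the timer expired.
 * Assumes the two counters differ by less than LLONG_MAX. */
long long timer_tte(unsigned long long expires, unsigned long long jiffies)
{
    if (expires >= jiffies)
        return (long long)(expires - jiffies);
    return -(long long)(jiffies - expires);
}
```

For example, a timer expiring at 1500 with the current jiffies at 1000
has 500 jiffies remaining; with the values swapped, it expired 500
jiffies ago and the result is negative.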
- Fix for a "warning: cast to pointer from integer of different size
[-Wint-to-pointer-cast]" compiler message generated by the previous
"timer" patch when compiling kernel.c on 32-bit architectures.
(anderson@redhat.com)
- Fix to the x86_64 "--machdep phys_base=<value>" command line option
to allow the use of a negative decimal number as the value. Without
the patch, only the hexadecimal representation of the value would be
accepted.
(v-santy@microsoft.com, anderson@redhat.com)
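One way to accept both forms of the phys_base value described above is
sketched below. The parse_phys_base() helper is hypothetical and makes
a simplifying assumption: a leading "-" selects signed decimal parsing,
and anything else is parsed as hexadecimal. The actual crash option
parser may differ.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical parser: store the parsed value in *out and return 0,
 * or return -1 if the string is not a valid number. */
int parse_phys_base(const char *s, unsigned long long *out)
{
    char *end;

    errno = 0;
    if (s[0] == '-') {
        /* Negative decimal: the conversion to unsigned yields the
         * two's-complement bit pattern of the value. */
        long long v = strtoll(s, &end, 10);

        if (errno || end == s || *end != '\0')
            return -1;
        *out = (unsigned long long)v;
    } else {
        /* Hexadecimal, with or without a "0x" prefix. */
        unsigned long long v = strtoull(s, &end, 16);

        if (errno || end == s || *end != '\0')
            return -1;
        *out = v;
    }
    return 0;
}
```

Under these assumptions, on a platform with a 64-bit unsigned long long,
"-16777216" and "ffffffffff000000" produce the same value.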
- Introduction of a new "rd -R" option, which will display memory in
reverse order. Memory will be displayed up to and including the
address argument, which requires that the count argument be greater
than 1 in order to display memory before the specified address.
(anderson@redhat.com)
- Add support for the "count" argument to be used in conjunction with
the "dis -r" and "dis -f" reverse/forward modes of operation. In
reverse mode, the specified "count" number of instructions leading
up to and including the target address will be displayed. In forward
mode, the display will be limited to "count" instructions. Without
the patch, using a count argument in either mode generates a "count
argument ignored" message, and the command proceeds as if it had
not been entered.
(anderson@redhat.com, atomlin@redhat.com)
- Fix a memory leak in the previous "dis" commit.
(anderson@redhat.com)
- Implemented a new "error" environment variable that sets the
destination of error messages. It can be set to either:
"default": error messages are always displayed on the
console; if the output of a command is piped to an
external command or redirected to a file, the error
messages are also sent to the pipe or file.
"redirect": if the output of a command is piped to an
external command or redirected to a file, error messages
are only sent to the pipe or file; otherwise they are
displayed on the console.
"filename": error messages are only sent to the specified
filename; they are not displayed on the console and
are not sent to a pipe or file.
(dkwon@redhat.com)
- Fix for the "kmem -n" option on Linux 5.3-rc1 and later kernels
that contain commit 326e1b8f83a4318b09033ef754f40c785aed5e68,
titled "mm/sparsemem: introduce a SECTION_IS_EARLY flag". Without
the patch, mem_map addresses containing the flag in bit 3 incorrectly
show it as part of the virtual address; with the patch, the option
displays the new "E" state flag.
(k-hagio@ab.jp.nec.com)
- Fix for the "timer" command in RHEL7.6 and later RHEL7 kernels.
Without the patch, the command emits extra faulty timer entries
because the tvec_root.vec[] and tvec.vec[] arrays are tracked using
hlist_head structures where list_head structures should be used.
(k-hagio@ab.jp.nec.com)
- crash-7.2.4 commit 6596f1121b added a "list -B" option to allow more
efficient enumeration of longer lists. There is a small bug with
this option where it may incorrectly flag a loop length of "0" on
a list of length 1, indicating "list: loop detected, loop length: 0".
Since it is impossible to have a loop of length 0, the erroneous
message can be prevented by ensuring the list count is non-zero.
(dwysocha@redhat.com)
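For readers unfamiliar with the Brent algorithm behind "list -B", the
following is a minimal, generic C sketch of Brent's cycle detection on
a singly-linked list. The struct node type and brent_loop_length() are
hypothetical and are not the crash implementation. Note that a
detected loop always has a length of at least 1; here a return value of
0 is reserved to mean that no loop was found.

```c
#include <stddef.h>

struct node {
    struct node *next;
};

/* Return the length of the loop reachable from "head", or 0 if the
 * list terminates with a NULL pointer. */
size_t brent_loop_length(const struct node *head)
{
    const struct node *tortoise = head, *hare = head;
    size_t power = 1, length = 0;

    if (head == NULL)
        return 0;

    for (;;) {
        hare = hare->next;
        if (hare == NULL)
            return 0;               /* list ends: no loop */
        length++;
        if (hare == tortoise)
            return length;          /* hare came back around */
        if (length == power) {
            /* Move the tortoise up to the hare and double the
             * number of steps allowed before the next move. */
            tortoise = hare;
            power *= 2;
            length = 0;
        }
    }
}
```

A single node whose next pointer refers to itself yields a loop length
of 1, and a single node with a NULL next pointer yields 0 (no loop).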
- Create the specified installation directory if it does not exist.
Without the patch, the Makefile's "make install" target will fail
if the INSTALLDIR and/or DESTDIR macros resolve to a non-existent
directory.
(pmenzel@molgen.mpg.de)
- Fix for the internal caching of the kernel's mem_map array of page
structures. Without the patch, in rare circumstances, commands such
as "kmem -p" may erroneously receive zero-filled page structures.
(k-hagio@ab.jp.nec.com)
- Fix to prevent a potential segmentation violation when accessing
the compressed configuration data contained in kernels that are
configured with CONFIG_IKCONFIG.
(chenqiwu@xiaomi.com)
- Determine the ARM64 SECTION_SIZE_BITS value using the following
order of precedence:
(1) from the VMCOREINFO data if it exists
(2) from the in-kernel configuration data if it exists
(3) the default value
(chenqiwu@xiaomi.com)
(09/20/19)
7.2.6-1.fc31
Information for build crash-7.2.6-1.fc31
https://koji.fedoraproject.org/koji/buildinfo?buildID=1263843
7.2.6 - Two fixes for the Xen hypervisor; the first fixes a bug seen
with Xen 4.11.0 during initialization, which fails with the error
message "crash: invalid kernel virtual address: <address> type:
fill_pcpu_struct", followed by "WARNING: cannot fill pcpu_struct"
and "crash: cannot read cpu_info". The second fix prevents a
segmentation violation associated with a crash-7.1.1 commit that
addressed the Xen 4.5.0 hypervisor symbol name change from
"dom0" to "hardware_domain".
(dietmar.hahn@ts.fujitsu.com)
- Fix for Linux 4.20 and later x86_64 kernels which are NOT
configured with CONFIG_RANDOMIZE_BASE. Linux 4.20 introduced
kernel commit d52888aa2753e3063a9d3a0c9f72f94aa9809c15, titled
"x86/mm: Move LDT remap out of KASLR region on 5-level paging",
which modified the 4-level and 5-level paging PAGE_OFFSET values.
Without this patch, the crash session fails during initialization
with the error message "crash: read error: kernel virtual address:
<address> type: tss_struct ist array". For kernels prior to
Linux 4.20.0 which have backports of the kernel commit, the kernel's
PAGE_OFFSET value must be manually specified via the command line
option "--machdep page_offset=ffff888000000000" for kernels with
4-level page tables, or "--machdep page_offset=ff11000000000000"
for kernels with 5-level paging (alternatively, the shorter version
"-m page_offset=<address>" may be used). The command line option
requirement may be revisited in the future.
(anderson@redhat.com)
- Fix for the "p" command if the expression contains more than one
opening parenthesis character and a minus/dash sign. Without the
patch, the minus/dash sign will get dropped from the command prior
to it being passed on to gdb for evaluation, and the command will
fail with the message "p: gdb request failed: <expression>",
where the <expression> string will not contain the minus/dash sign.
(anderson@redhat.com)
- Fix for the internal parse_line() utility function to account for
embedded sets of parentheses, which may be used for expressions that
are passed to gdb by the "p" command. Without the patch, expressions
containing embedded sets of parentheses are broken up into multiple
argument tokens instead of just one. The previous commit has been
reverted by this one.
(anderson@redhat.com)
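The tokenizing behavior described in the parse_line() entry above can be
sketched as follows. The split_args() function is a hypothetical,
simplified stand-in and is not crash's parse_line(): it splits a line on
whitespace, except that whitespace inside parentheses does not end a
token.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: split "line" in place into at most "max"
 * tokens, storing pointers in argv[] and returning the token count.
 * Whitespace nested inside parentheses is kept within the token. */
int split_args(char *line, char *argv[], int max)
{
    int argc = 0, depth = 0;
    char *p = line;

    while (*p && argc < max) {
        while (isspace((unsigned char)*p))
            p++;                    /* skip leading whitespace */
        if (!*p)
            break;
        argv[argc++] = p;
        for (; *p; p++) {
            if (*p == '(')
                depth++;
            else if (*p == ')' && depth > 0)
                depth--;
            else if (isspace((unsigned char)*p) && depth == 0)
                break;              /* token ends outside parentheses */
        }
        if (*p)
            *p++ = '\0';            /* terminate the token */
    }
    return argc;
}
```

Given the line "p ((struct zone *)0xffff901e3ffda000)->min_slab_pages",
this sketch produces two tokens: the command name and the complete
expression.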
- First phase of support for ARM64 kernels that are configured with
CONFIG_ARM64_USER_VA_BITS_52, which causes the PTRS_PER_PGD count
to increase from 64 to 1024. Without the patch, "WARNING: cannot
access vmalloc'd module memory" will be displayed during session
initialization, and the translation of any mapped kernel virtual
address that requires a page table walk will fail, leading to a
myriad of other errors.
(anderson@redhat.com)
- Support for configurable CONFIG_ARM64_PA_BITS values introduced
in kernel commit 982aa7c5f0861bf56b2412ca341a13f44c238ba4, titled
"arm64: add kconfig symbol to configure physical address size".
Without the patch, it is impossible to determine the value of
CONFIG_ARM64_PA_BITS; determining it requires a new MAX_PHYSMEM_BITS
vmcoreinfo entry to be exported. This patch reads that entry
during initialization.
(anderson@redhat.com)
- For live system analysis where there is no vmcoreinfo ELF note
attached to /proc/kcore, or for dumpfile analysis where there is no
vmcoreinfo ELF note attached to the dumpfile, this patch sets the
internal pc->read_vmcoreinfo() function to a new plugin function
that reads the data directly from the live kernel or dumpfile.
Because the function is set much later during initialization than
if the ELF note is attached to /proc/kcore or the dumpfile, it may
not be available during very early session initialization.
(anderson@redhat.com)
- Fix for Linux 4.14.84 and later 4.14-based x86_64 kernels which
are NOT configured with CONFIG_RANDOMIZE_BASE and have backported
kernel commit d52888aa2753e3063a9d3a0c9f72f94aa9809c15, titled
"x86/mm: Move LDT remap out of KASLR region on 5-level paging",
which modified the 4-level and 5-level paging PAGE_OFFSET values.
Without this patch, the crash session fails during initialization
with the error message "crash: read error: kernel virtual address:
<address> type: tss_struct ist array".
(anderson@redhat.com)
- Fix for determining the x86_64 "phys_base" value in dumpfiles created
by the KVM "virsh dump" facility if the kernel is KASLR-enabled and
does not have the phys_base value stored in vmcoreinfo data. Without
the patch, the message "WARNING: cannot determine physical base
address: defaulting to 0" is displayed, and the crash session fails
to initialize.
(jiangran.jr@alibaba-inc.com)
- 32-bit ARM kernels built with the Thumb-2 instruction set utilize
the R7 register instead of FP for unwinding stacks using the DWARF
unwinder. On those kernels, without the patch, the "bt" command
only shows the task header.
(vincent.whitchurch@axis.com)
- Fix for the "kmem -z" option on Linux 5.0 and later kernels
that contain commit a921444382b49cc7fdeca3fba3e278bc09484a27,
titled "mm: move zone watermark accesses behind an accessor".
Without the patch, the command fails with the error message
"kmem: invalid (optional) structure member offsets: zone_pages_min
or zone_struct_pages_min".
(k-hagio@ab.jp.nec.com)
- Fix for the "kmem -i" option on Linux 5.0 and later kernels
that contain commit ca79b0c211af63fa3276f0e3fd7dd9ada2439839
titled "mm: convert totalram_pages and totalhigh_pages variables
to atomic". Without the patch, the command prints some incorrect
values, and also does not print high/low memory information on
kernels which are configured with CONFIG_HIGHMEM.
(k-hagio@ab.jp.nec.com)
- Fix for the display of kernel module symbol types by the "sym"
command in Linux 5.0 and later kernels if the module debuginfo
data has not been loaded into the crash session. The st_info member
of the Elf32_Sym or Elf64_Sym structures has changed so as to not
contain ASCII symbol type characters, and as a result the "sym"
command will show unprintable data as the symbol type. With the
patch, only text types ("t" or "T") will be displayed, and the
other symbols will show "?".
(anderson@redhat.com)
- First phase of support of the upcoming ARM64 kernel memory map
changes to support 52-bit kernel virtual addressing, which allows
the configuration of CONFIG_ARM64_VA_BITS to be 52, but where the
actual number of VA bits may be downgraded during boot depending
upon the hardware capability. This phase is only applicable for
live system analysis.
(anderson@redhat.com)
- Fix for the "dis <function>" option with kernel module text
symbols on Linux 5.0 and later kernels. Without the patch, the
disassembly may stop prematurely or extend into the next function
because the st_size member of the Elf32_Sym or Elf64_Sym text
symbol structures can no longer be used as the function size.
(anderson@redhat.com)
- Commit dd12805ed1db7 in the linux-next kernel repository, titled
"XArray: Remove radix tree compatibility", changes the definition
of "radix_tree_root" back to be a struct. However, the content of
the new structure differs from the original structure, so without
the patch, current linux-next kernels fail during initialization
with the error message "radix trees do not exist or have changed
their format". Because the new "radix_tree_root" and "xarray"
structures have nearly the same layout, the existing functionality
for XArrays can be reused.
(prudo@linux.ibm.com)
- Fixes for the "trace.so" extension module:
(1) The reader_page can be empty if it was never read; do not record
it if it is empty. Better yet, do not record any page that is
empty. The struct buffer_page "real_end" is not available in
older kernels, so it needs to be tested if it exists before we
can use it.
(2) In newer kernels, the sp->type of kernel module symbols does not
contain the symbol type character unless the module's debuginfo
data has been loaded into the crash session. Writing a garbage
type to the kallsyms file for trace-cmd to read causes it to
crash, so just always write an 'm'.
(3) Add the "trace dump -t <trace.dat>" option to the SYNOPSIS line
of the help page.
(rostedt@goodmis.org)
- Fix to find the kernel configuration data in Linux 5.1 kernels
containing commit 13610aa908dcfce77135bb799c0a10d0172da6ba, titled
"kernel/configs: use .incbin directive to embed config_data.gz".
Without the patch, new kernels configured with CONFIG_IKCONFIG_PROC
will display "WARNING: could not find MAGIC_START!" during session
initialization, and also when running "sys config" during runtime.
(anderson@redhat.com)
- Fix for the PPC64 "bt" command running against kernels that are
configured with CONFIG_THREAD_INFO_IN_TASK. Without the patch,
the "bt" command fails with the message "bt: invalid/stale stack
pointer for this task: <address>".
(anderson@redhat.com)
- Fix for the "files -d <dentry>" option if the dentry.d_inode
pointer is NULL. Without the patch, the command output does not
display the super_block pointer or the file's pathname.
(martin.moore@hpe.com)
- When the is_s390_dump() function is called to determine whether
a file is an s390 dumpfile, it currently presumes that the fopen()
call always works, and then tries to read it using a NULL file
pointer. Change it to verify that the fopen() was successful, and
if not, print an error message as is done with the other dumpfile
type verifier functions.
(ramin.blackhat@gmail.com)
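The defensive pattern described in the is_s390_dump() entry above,
checking the result of fopen() before reading, can be sketched in C as
follows. The file_has_magic() helper and its return convention are
hypothetical; this is not the crash function.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical verifier: return 1 if the file at "path" begins with
 * the "len" bytes at "magic", 0 if it does not, and -1 if the file
 * cannot be opened or "len" is too large for the local buffer. */
int file_has_magic(const char *path, const void *magic, size_t len)
{
    unsigned char buf[16];
    size_t got;
    FILE *fp;

    if (len > sizeof(buf))
        return -1;

    fp = fopen(path, "rb");
    if (fp == NULL) {
        /* Report the failure instead of passing NULL to fread(). */
        fprintf(stderr, "%s: %s\n", path, strerror(errno));
        return -1;
    }

    got = fread(buf, 1, len, fp);
    fclose(fp);

    return got == len && memcmp(buf, magic, len) == 0;
}
```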
- Implement support for ARM64 kernels that are configured with:
CONFIG_ARM64_PA_BITS=52
CONFIG_ARM64_64K_PAGES
CONFIG_PGTABLE_LEVELS=3
and that run on a host containing physical memory that utilizes
any bit in the uppermost 4 bits of the 52-bit physical address
range.
(anderson@redhat.com)
- Extension of the "snap.so" extension module to pass a second
architecture-specific value in the ELF header; its initial use
is for support of the upcoming ARM64 52-bit kernel virtual
address space by passing both the VA_BITS and VA_BITS_ACTUAL
values.
(anderson@redhat.com)
- Apply initial changes to support kernel address space layout
randomization (KASLR) for s390X. This is the minimal patch-set
required to process s390x dumps for the kernels configured with
CONFIG_RANDOMIZE_BASE, and to accept the "--kaslr" command line
option. Only dumpfiles whose headers contain kernel VMCOREINFO
data are supported.
(zaslonko@linux.ibm.com)
- Fix for the "dev -[dD]" options on Linux 5.1-rc1 and later kernels
that contain commit 570d0200123fb4f809aa2f6226e93a458d664d70, titled
"driver core: move device->knode_class to device_private". Without
the patch, the command options fail with the error message "dev:
invalid structure member offset: device_knode_class".
(k-hagio@ab.jp.nec.com)
- Linux 4.18 kernels introduced a new CONFIG_PROC_VMCORE_DEVICE_DUMP
configuration in commit 2724273e8fd00b512596a77ee063f49b25f36507,
titled "vmcore: add API to collect hardware dump in second kernel",
in which device drivers may collect a device specific snapshot of the
hardware/firmware state of their underlying devices, and export the
data as a kdump ELF note with type NT_VMCOREDD. This patch
recognizes the new ELF note(s) in both ELF and compressed kdump
vmcore dumpfiles. The "help -[nD]" option shows basic information
about each note, and two new "dev" command options have been
introduced. The "dev -V" option displays an indexed list of each
note, showing the device name, the dumpfile offset, and the size
of each note. The "dev -v index [file]" option either dumps the
contents of a note to the display screen in a human-readable format,
or copies the note data directly to a specified file.
(surendra@chelsio.com)
- If the kernel's "vmap_area_list" doubly-linked list is corrupt such
that it does not link back to the global list_head, commands that
require information regarding the range of virtually-mapped kernel
addresses will display a generic list-handling error message such as
"kmem: invalid list entry: 0", and the command will typically fail
to fully complete. However, without the patch, there will also be
"WARNING: malloc/free mismatch (29/30)" messages that get displayed
after every subsequent command. This patch prevents the mismatch
messages, and also adds an additional error message indicating
"WARNING: invalid/corrupt vmap_area_list" to further clarify the
generic list-handling error message.
(dwysocha@redhat.com, anderson@redhat.com)
- Fix for the "dev" help page to remove the unused -r option letter.
(surendra@chelsio.com)
- If a duplicate list entry is encountered when using the "list -B"
Brent algorithm, change the list loop length value from hexadecimal
to decimal.
(dwysocha@redhat.com)
- Update the README file to indicate the capability of building an
x86_64 crash binary with "make target=PPC64", which can be used to
analyze ppc64le dumpfiles on an x86_64 host.
(anderson@redhat.com)
- Fix for hybrid kernels that have backported support for the XArray
facility while allowing subsystems to continue to use radix trees.
Without the patch, the crash session fails during initialization
with the message "crash: xarray facility does not exist or has
changed its format".
(anderson@redhat.com)
(05/03/19)
7.2.5-2.fc30
Information for build crash-7.2.5-2.fc30
https://koji.fedoraproject.org/koji/buildinfo?buildID=1185328
7.2.5 - Resurrection of the "dev -p" option for displaying PCI device
data on Linux 2.6.26 and later kernels. The option was deprecated
as of Linux 2.6.26, and without the patch, the option would indicate
"dev: -p option not supported or applicable on this architecture
or kernel" when running against the newer kernel versions. PCI Bus
information will also be displayed with this patch.
(m.mizuma@jp.fujitsu.com)
- With Linux 4.19-rc1 commit 7d4340bb92a9df78e6e28152f3dd89d9bd82146b,
titled "powerpc/mm: Increase MAX_PHYSMEM_BITS to 128TB with
SPARSEMEM_VMEMMAP config", the PPC64 MAX_PHYSMEM_BITS value has
been bumped up to 47. The appropriate update has been made in
this patch.
(hbathini@linux.ibm.com)
- Fix to allow piping command output to a shell script beginning with
a shebang (#!) character sequence if the script pathname is specified
with a preceding "./" or "/". Without the patch, the piped command
fails with the message "crash: pipe operation failed".
(k-hagio@ab.jp.nec.com)
- Fix for the PPC64 "bt" command to recognize when a thread is running
in OPAL firmware. Without the patch, the "bt" command indicates
"<task-address>: Invalid Stack Pointer <OPAL-firmware-address>".
(hbathini@linux.ibm.com)
- As an addendum to the "dev -p" patch above, add the new structure
member offsets for display by the "help -o" option.
(anderson@redhat.com)
- Enhancement to the "kmem -n" option to dump memory block information
if the kernel supports it. In addition, the memory section data
block has a new "STATE" column added to it.
(m.mizuma@jp.fujitsu.com)
- Addendum to the previous "kmem -n" patch to fix a FTBFS issue.
Without the patch, certain architectures fail to compile with the
error "memory.c:17315:16: error: 'PAGE_SHIFT' undeclared (first
use in this function)"
(m.mizuma@jp.fujitsu.com)
- Fix the calculation of the vmalloc memory region size to account for
Linux 4.17 commit a7412546d8cb5ad578805060b4006f2a021b5868, titled
"x86/mm: Adjust vmalloc base and size at boot-time", which increases
the region's size from 32TB to 1280TB when 5-level pagetables are
enabled. Also presume that virtual addresses above the end of the
vmalloc space up to the beginning of vmemmap space are translatable
via 5-level page tables. Without the patch, mapped virtual addresses
may fail translation in whatever command accesses them, with errors
indicating "seek error: kernel virtual address: <mapped-address>
type: <type-string>"
(anderson@redhat.com)
- Address several Coverity Scan "RESOURCE_LEAK" issues in the following
top-level source files: cmdline.c, kvmdump.c, lkcd_v8.c, xendump.c,
symbols.c, unwind_x86_32_64.c, va_server.c and va_server_v1.c.
(anderson@redhat.com)
- Modify the x86_64 "bt" behavior when a legitimate exception RIP value
cannot be referenced symbolically, such as when the exception occurs
while running in seccomp BPF filter code. Without the patch, the
exception frame register dump is preceded by "[exception RIP: unknown
or invalid address]", and then followed by "bt: WARNING: possibly
bogus exception frame". With the patch applied, the translation of
the exception RIP will show "[exception RIP: no symbolic reference]",
and there will be no warning message.
(anderson@redhat.com)
- Account for the /proc/kcore VMCOREINFO PT_NOTE in Linux 4.19 and
later kernels having commit 23c85094fe1895caefdd19ef624ee687ec5f4507,
titled "proc/kcore: add vmcoreinfo note to /proc/kcore". The PT_NOTE
information is stored during session initialization for later display
by "help -[n|D]"; a subsequent commit will make it available for use
by the crash utility's internal pc->read_vmcoreinfo() function.
(anderson@redhat.com)
- Second phase of support for the VMCOREINFO PT_NOTE added to the ELF
header of /proc/kcore in Linux 4.19 and later kernels. This patch
introduces support for live session /proc/kcore VMCOREINFO access by
the crash utility's internal pc->read_vmcoreinfo() function. New
uses include the initialization of the x86_64 phys_base value, and
the arm64 phys_offset, page size, and VA bits count.
(anderson@redhat.com)
- Fix for Linux 4.20-rc1 and later kernels that contain kernel commit
5c83511bdb9832c86be20fb86b783356e2f58062, titled "x86/paravirt: Use
a single ops structure". Without the patch, the kernel may be
misidentified as an ARCH_XEN kernel, with the most noticeable result
being the inability to read vmemmap'd page structures.
(anderson@redhat.com)
- Implemented the functionality for a new MEMBER_TYPE_NAME() macro,
which will return a pointer to the type name string of a structure
member. It is being put in place for the support of Linux 4.20
radix tree to xarray replacements, where structure member types may
be changed from radix_tree_root structures to xarray structures.
(anderson@redhat.com)
- First phase of support for the XArray facility. The added support is
similar to that of radix trees, but introduces completely separate
functions, structures and #defines. None of the applicable radix
tree users in the crash utility have been switched over, so this
phase does not introduce any functional changes.
(asmadeus@codewreck.org, anderson@redhat.com)
- Second phase of support for the XArray facility, which handles the
switch-over of PID handling from a radix tree to an XArray in Linux
4.20 and later kernels. Without the patch, the crash session fails
during session initialization with the message "crash: radix trees
do not exist or have changed their format".
(asmadeus@codewreck.org, anderson@redhat.com)
- Third phase of support for the XArray facility, which consolidates
the radix_tree_pair and xarray_pair structures into a unified
list_pair structure that is used by both facilities, and fixes the
"bpf" command. Without the patch, the command fails on Linux 4.20
or later kernels with the error message "bpf: radix trees do not
exist or have changed their format".
(anderson@redhat.com)
- Added support for usage of the XArray facility by the "files -p"
option. Without the patch, the command fails on Linux 4.20 and later
kernels with the error message "files: radix trees do not exist or
have changed their format".
(anderson@redhat.com)
- Added support for usage of the XArray facility by the "irq" command.
Without the patch, the command fails on Linux 4.20 and later kernels
with the error message "irq: radix trees do not exist or have changed
their format".
(anderson@redhat.com)
- Added support for usage of the XArray facility by the "ipcs" command.
Without the patch, the command may fail on Linux 4.20 and later
kernels with the error message "ipcs: radix trees do not exist or have
changed their format".
(anderson@redhat.com)
- Added a new "tree -t xarray" option to display the contents of
an XArray in Linux 4.20 and later kernels. The implementation is
similar to that of radix tree displays, but in addition, the "-p"
option will also display the index value of each entry in a radix
tree or XArray.
(anderson@redhat.com)
- Fix for the "files -p <inode>" option on a file with a large
number of pages. Without the patch, the command attempts to read
radix tree node slot entries that are RADIX_TREE_EXCEPTIONAL_ENTRY
types instead of page pointers, and as a result may fail with a
dump of the internal buffer allocation stats followed by the message
"files: cannot allocate any more memory!".
(anderson@redhat.com)
- Fix for the "ps -s" option on ARM64 if the number of tasks exceeds
2000. Without the patch, the command ultimately fails with a
dump of the internal buffer allocation stats, followed by the
message "ps: cannot allocate any more memory!".
(anderson@redhat.com)
- With Linux 4.20-rc1 commit 4ffe713b7587b14695c9bec26a000fc88ef54895,
titled "powerpc/mm: Increase the max addressable memory to 2PB",
the PPC64 MAX_PHYSMEM_BITS value has been bumped up to 51 for
CONFIG_SPARSEMEM_VMEMMAP and CONFIG_SPARSEMEM_EXTREME. The
appropriate update has been made in this patch.
(hbathini@linux.ibm.com)
- Implemented a new plugin function for the readline library's tab
completion feature. Without the patch, the use of the default plugin
from the embedded gdb module has been seen to cause segmentation
violations or other fatal malloc/free/corruption assertions. The new
plugin takes gdb out of the picture entirely, and also restricts the
matching options to just symbol names, so as not to clutter the
results with irrelevant filenames.
(anderson@redhat.com)
- The RHEL8 kernel will contain a backport of the Linux 4.19 kernel
commit 7d4340bb92a9df78e6e28152f3dd89d9bd82146b, titled "powerpc/mm:
Increase MAX_PHYSMEM_BITS to 128TB with SPARSEMEM_VMEMMAP config".
As a result, the use of the THIS_KERNEL_VERSION() macro by the
crash utility does not suffice for determining the MAX_PHYSMEM_BITS
value for PPC64. The appropriate update has been made in this patch.
(anderson@redhat.com)
- Fix for an initialization-time session failure when all three of the
following conditions exist:
(1) invoking the session with "crash -d2" or larger debug number
(2) running against a Linux 3.3 or later kernel
(3) using a post-7.2.4 crash utility that has the new "kmem -n"
support above for the display of memory blocks
Without the patch, the crash session fails with the error message
"crash: invalid structure member offset: device_kobj".
(anderson@redhat.com)
- Fix for an initialization-time segmentation violation when invoking
crash-7.2.4 or later with "crash -d2" or larger debug number.
(anderson@redhat.com)
- Add a write operation handler to the sample /dev/crash memory driver
that enables writing to kernel memory via the "wr" command.
(serapheim@delphix.com)
- Prevent a SIGSEGV if a user attempts to input a command line that
exceeds the maximum length of 1500 bytes. The patch displays an
error message and ignores the command line.
(anderson@redhat.com)
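The length check described in the entry above can be sketched in C.
The accept_command_line() helper is hypothetical, and the assumption
that the 1500-byte limit includes the terminating NUL is made here only
for illustration; the crash sources may count it differently.

```c
#include <stdio.h>
#include <string.h>

#define MAX_CMDLINE 1500    /* limit quoted in the entry above */

/* Hypothetical helper: copy "input" into a MAX_CMDLINE-byte buffer
 * only if it fits together with its terminating NUL. Otherwise print
 * an error, leave "dst" empty, and return -1 so that the caller can
 * ignore the line. Returns 0 on success. */
int accept_command_line(char dst[MAX_CMDLINE], const char *input)
{
    size_t len = strlen(input);

    if (len >= MAX_CMDLINE) {
        fprintf(stderr, "command line too long: %zu bytes (limit %d)\n",
                len, MAX_CMDLINE - 1);
        dst[0] = '\0';
        return -1;
    }

    memcpy(dst, input, len + 1);    /* includes the terminating NUL */
    return 0;
}
```

Checking the length before copying avoids writing past the end of the
fixed-size buffer, which is the kind of overrun that typically ends in
a SIGSEGV.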
- Fix for the "dev -[dD]" options in kernels containing Linux 5.0-rc1
commit 7ff4f8035695984c513598e2d49c8277d5d234ca, titled "block:
remove dead queue members", in which the number of I/Os issued to
a disk driver is no longer stored in the request_queue structure.
Without the patch, the options indicate "dev: -d option not supported
or applicable on this architecture or kernel". With the patch, the
"DRV" column is not shown.
(m.mizuma@jp.fujitsu.com)
- A crash-7.1.1 commit added support for Linux version 5.x. To prevent
surprise failures due to unexpected kernel version bumps in the
future, support has been added for version 6, keeping it one step
ahead.
(anderson@redhat.com)
- Fix for a gcc-9 compilation error that occurs if an inline asm
statement clobbers the stack pointer. Without the patch, x86 and
x86_64 builds will fail to compile gdb-7.6/gdb/common/linux-ptrace.c,
generating an error that indicates "error: Stack Pointer register
clobbered by '%rsp' in 'asm'".
(anderson@redhat.com)
(01/10/19)
7.2.4-1.fc30
Information for build crash-7.2.4-1.fc30
https://koji.fedoraproject.org/koji/buildinfo?buildID=1147344
7.2.4 - Fix for the "timer -r" command on Linux 4.10 and later kernels that
contain commit 2456e855354415bfaeb7badaa14e11b3e02c8466, titled
"ktime: Get rid of the union". Without the patch, the command fails
with the error message "timer: invalid structure member offset:
ktime_t_sec".
(k-hagio@ab.jp.nec.com)
- Fix for the x86 and x86_64 "mach -m" option on Linux 4.12 and later
kernels to account for the structure name changes "e820map" to
"e820_table", and "e820entry" to "e820_entry", and for the symbol
name change from "e820" to "e820_table". Also updated the display
output to properly translate E820_PRAM and E820_RESERVED_KERN entries.
Without the patch on all kernels, E820_PRAM and E820_RESERVED_KERN
entries show "type 12" and "type 128" respectively. Without the
patch on Linux 4.12 and later kernels, the command fails with the
error message "mach: cannot resolve e820".
(anderson@redhat.com)
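The type translation described above amounts to a table lookup with a
numeric fallback. A minimal sketch in Python, not the crash source: the
values 12 and 128 come from the entry above, while the other table
entries and the name e820_type_name are assumptions for illustration.

```python
# Hypothetical lookup table. 12 and 128 are the values named in the
# changelog entry; 1 through 5 are the conventional E820 type numbers.
E820_TYPES = {
    1: "E820_RAM",
    2: "E820_RESERVED",
    3: "E820_ACPI",
    4: "E820_NVS",
    5: "E820_UNUSABLE",
    12: "E820_PRAM",
    128: "E820_RESERVED_KERN",
}


def e820_type_name(type_value):
    """Return a symbolic name, falling back to the raw "type N" form."""
    return E820_TYPES.get(type_value, "type %d" % type_value)
```

An unknown value still produces the raw "type N" form, which is the
output the entry describes for the two types before the patch.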
- Update for the recognition of the new x86_64 CPU_ENTRY_AREA virtual
address range introduced in Linux 4.15. The memory range exists
above the vmemmap range and below the mapped kernel static text/data
region, and is where all of the x86_64 exception stacks have been moved.
Without the patch, reads from the new memory region fail because the
address range is not recognized as a legitimate virtual address.
Most notable is the failure of "bt" on tasks whose backtraces
originate from any of the exception stacks, which fail with the two
error messages "bt: seek error: kernel virtual address: <address>
type: stack contents" followed by "bt: read of stack at <address>
failed".
(anderson@redhat.com)
- Fix to address a "__builtin___snprintf_chk" compiler warning if bpf.c
is compiled with -D_FORTIFY_SOURCE=2.
(anderson@redhat.com)
- Fix for the "bpf -t" option. Although highly unlikely, without the
patch, the target function name of a BPF bytecode call instruction
may fail to be resolved correctly.
(anderson@redhat.com)
- Fix for the case in which /proc/kcore gets selected for the live
memory source because /dev/mem was configured with
CONFIG_STRICT_DEVMEM, and its ELF header contents are not displayed
by "help -[dD]" or when the crash session is invoked with
"-d<number>". Without the
patch, the ELF contents are only displayed in those two situations
if "/proc/kcore" is explicitly entered on the crash command line.
(anderson@redhat.com)
- If the default live memory source /dev/mem is determined to be
unusable because the kernel was configured with CONFIG_STRICT_DEVMEM,
the first memory read during session initialization will fail. The
current behavior results in a readmem() error message, followed by two
notification messages that indicate that /dev/mem is restricted and
a switch to using /proc/kcore will be attempted; the readmem is
reattempted from /proc/kcore, and if successful, the session will
continue initialization. With this patch, the behavior will change
such that if the switch to /proc/kcore and the reattempted readmem()
are successful, no messages will be displayed unless the crash
session is invoked with "crash -d<number>".
(anderson@redhat.com)
- Fix for the ppc64/ppc64le "bt" command on Linux 4.7 and later kernels
that contain commit d8bff643d81a58181356c0aa3ab771ac10da6894,
titled "[x86] asm: Make sure verify_cpu() has a good stack", which
inadvertently breaks the ppc64/ppc64le kernel stack size calculation
when running with crash-7.2.2 or later. Without the patch, "bt" may
fail with a filtered kdump dumpfile with the two error messages
"bt: page excluded: kernel virtual address: <address> type: stack
contents" and "bt: read of stack at <address> failed".
(anderson@redhat.com)
- Fix for PPC64 kernel virtual address translation in Linux 4.17 and
later kernels with commit c2b4d8b7417a59b7f9a52d0d8402f5257cbbd398,
titled "powerpc/mm/hash64: Increase the VA range", in which the
maximum virtual address value has been increased to 4PB. Without
the patch, the translation/access of high vmalloc space addresses
fails; for example, the "kmem -[sS]" option fails the translation
of per-cpu kmem_cache_cpu addresses located in vmalloc space, with
the error messages "kmem: invalid kernel virtual address: <address>
type: kmem_cache_cpu.freelist" and "kmem: invalid kernel virtual
address: <address> type: kmem_cache_cpu.page", and the "vtop"
command shows the addresses as "(not mapped)".
(hbathini@linux.ibm.com)
- Fix for the x86_64 "bt" command in which a legitimate exception
frame is appended with the message "bt: WARNING: possibly bogus
exception frame". This only happens in KASLR-enabled kernels when
the text address that was executing when the exception occurred
is marked as a "weak" symbol (type "W") instead of a text symbol
(type "T" or "t"). As a result, the exception frame's RIP is not
recognized as a text symbol, and the warning message is displayed.
(anderson@redhat.com)
- Fix for the x86_64 "bt" command in Linux 4.16 and later kernels
containing commit 3aa99fc3e708b9cd9b4cfe2df0b7a66cf293e3cf, titled
"x86/entry/64: Remove 'interrupt' macro". Without the patch, the
exception frame display generated by an interrupt exception will
show incorrect contents, and be followed by the message "bt: WARNING:
possibly bogus exception frame".
(anderson@redhat.com)
- Fix for the failure of several "kmem" command options, most notably
seen if the command is piped directly into a crash session, or if
the command is contained in an input file. For example:
$ echo "kmem -i" | crash ...
$ crash -i <input-file> ...
Without the patch, the kmem command may fail with the error message
"<segmentation violation in gdb>". While the bug is due to a buffer
overflow that has always existed, it is only triggered by certain
kernel configurations.
(anderson@redhat.com)
- Update for the "kmem -V" option to also dump the global entries that
are contained in the "vm_numa_stat" array that was introduced in
Linux 4.14. Also, the command output separates the "vm_zone_stat",
"vm_node_stat" and "vm_numa_stat" entries into separate sections with
"VM_ZONE_STAT", "VM_NODE_STAT" and "VM_NUMA_STAT" headers. Without
the patch, the "vm_zone_stat" and "vm_node_stat" entries are listed
together under a "VM_STAT" header.
(anderson@redhat.com)
- Support for the "bpf" command on RHEL 3.10.0-913.el7 and later
3.10-based RHEL7 kernels, which contain a backport of the upstream
eBPF code, but still use the older, pre-4.11, IDR facility that does
not use radix trees for linking the active bpf_prog and bpf_map
structures. Without the patch, the command indicates "bpf: command
not supported or applicable on this architecture or kernel".
(anderson@redhat.com)
- Third phase of support for x86_64 5-level page tables in Linux 4.17
and later kernels. With this patch, the usage of 5-level page tables
is automatically detected on live systems and when running against
vmcores that contain the new "NUMBER(pgtable_l5_enabled)" VMCOREINFO
entry. Without the patch, the "--machdep vm=5level" command line
option is required.
(douly.fnst@cn.fujitsu.com, anderson@redhat.com)
- The existing "list" command uses a hash table to detect duplicate
items as it traverses the list. The hash table approach has worked
well for many years. However, with increasing memory sizes and list
sizes, the overhead of the hash table can be substantial, often
leading to commands running for a very long time. For large lists,
we have found that the existing hash based approach may slow the
system to a crawl and possibly never complete. You can turn off
the hash with "set hash off" but then there is no loop detection; in
that case, loop detection must be done manually after dumping the
list to disk or some other method. This patch is an implementation
of the cycle detection algorithm from R. P. Brent as an alternative
algorithm for the "list" command. The algorithm both avoids the
overhead of the hash table and yet is able to detect a loop. In
addition, further loop characteristics are printed, such as the
distance to the start of the loop as well as the loop length.
An excellent description of the algorithm can be found here on
the crash-utility mailing list:
https://www.redhat.com/archives/crash-utility/2018-July/msg00019.html
A new "list -B" option has been added to the "list" command to
invoke this new algorithm rather than using the hash table. In
addition to low memory usage, the output of the list command is
slightly different when a loop is detected. In addition to printing
the first duplicate entry, the length of the loop and the distance
to the loop are output.
(dwysocha@redhat.com)
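For readers unfamiliar with Brent's algorithm, the idea behind the
"list -B" entry above can be sketched in Python. This is a generic
textbook formulation, not the crash implementation; the function name
brent and the next_of callback are assumptions for illustration. It
returns the loop length and the distance from the list head to the
start of the loop, using only a constant amount of memory.

```python
def brent(start, next_of):
    """Return (loop_length, distance_to_loop), or (0, None) if no loop.

    next_of(node) returns the following node, or None at the list end.
    """
    # Phase 1: find the loop length with power-of-two search windows.
    power = lam = 1
    tortoise = start
    hare = next_of(start)
    while hare is not None and tortoise != hare:
        if power == lam:          # window exhausted: restart from hare
            tortoise = hare
            power *= 2
            lam = 0
        hare = next_of(hare)
        lam += 1
    if hare is None:
        return 0, None            # the list terminated: no loop

    # Phase 2: find the distance from the head to the loop start.
    tortoise = hare = start
    for _ in range(lam):
        hare = next_of(hare)
    mu = 0
    while tortoise != hare:
        tortoise = next_of(tortoise)
        hare = next_of(hare)
        mu += 1
    return lam, mu
```

For a list 0 -> 1 -> 2 -> 3 -> 4 -> 2, calling brent(0, links.get) with
links = {0: 1, 1: 2, 2: 3, 3: 4, 4: 2} yields a loop length of 3 and a
distance of 2.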
- Fix for x86_64 "bt" command to prevent an in-kernel exception frame
from not being displayed. Without the patch, if the RIP in a pt_regs
structure on the stack is not a kernel text address, such as a NULL
pointer, it is not recognized as an exception frame and the register
set is not displayed.
(anderson@redhat.com)
- Fix for the "repeat" command when the argument consists of an input
file construct, for example, "repeat -1 < input_file". Without the
patch, only the first command line in the input file is executed
each time.
(anderson@redhat.com)
- Fourth phase of support for x86_64 5-level page tables in Linux 4.17
and later kernels. This patch adds support for user virtual address
translation when the kernel is configured with CONFIG_X86_5LEVEL.
(douly.fnst@cn.fujitsu.com)
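As background for the 5-level translation mentioned above, the sketch
below splits a virtual address into its per-level table indexes. It is
an illustrative Python model, assuming 4 KiB pages and 9 index bits per
level; the function name split_vaddr_5level is hypothetical and the
code is not taken from the crash source.

```python
PAGE_SHIFT = 12       # 4 KiB pages: the low 12 bits are the page offset
BITS_PER_LEVEL = 9    # each table holds 512 entries
# Table names from the lowest level to the highest.
LEVELS = ("pte", "pmd", "pud", "p4d", "pgd")


def split_vaddr_5level(vaddr):
    """Split a virtual address into five table indexes plus an offset."""
    fields = {"offset": vaddr & ((1 << PAGE_SHIFT) - 1)}
    for i, name in enumerate(LEVELS):
        shift = PAGE_SHIFT + i * BITS_PER_LEVEL
        fields[name] = (vaddr >> shift) & ((1 << BITS_PER_LEVEL) - 1)
    return fields
```

The extra "p4d" index between "pgd" and "pud" is what distinguishes the
5-level layout from the 4-level one.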
- Fix to prevent an unnecessary "read error" message during session
initialization on live systems running a kernel that is configured
with CONFIG_X86_5LEVEL. Without the patch, a message indicating
"crash: read error: kernel virtual address: <address> type:
__pgtable_l5_enabled" will be displayed if /proc/kcore gets
selected as the live memory source after /dev/mem is determined
to be unusable.
(anderson@redhat.com)
- Update for "ps" and "foreach" commands to display and recognize two
new process states, "ID" for the TASK_IDLE macro introduced in
Linux 4.2, and "NE" for the TASK_NEW bit introduced in Linux 4.8.
(k-hagio@ab.jp.nec.com)
- Fix for running live on ARM64 kernels against /proc/kcore on kernels
configured with CONFIG_RANDOMIZE_BASE. Without the patch, depending
upon the hardware platform, the session may fail with the error message
"crash: vmlinux and /proc/kcore do not match!".
(anderson@redhat.com)
- Modify the output of the "kmem -[sS]" header and contents such that
the slab cache "NAME" string is moved from the second column to the
last column. Since the slab cache name strings have become
increasingly longer over time, without the patch, the numerical
column contents may be skewed so far to the right that the output
becomes difficult to read.
(k-hagio@ab.jp.nec.com)
- Fix for the "files" and "net -s" commands when a task has an open
files count that exceeds 1024 (FD_SETSIZE) file descriptors. Without
the patch, the commands may omit the display of open file descriptors.
(tan.hu@zte.com.cn)
- As an addendum to the new "kmem -[sS]" output format, align the slab
cache name string so that it is beneath the "NAME" header column when
the "kmem -I <slab-cache>" option is used to ignore a slab cache,
or if the scan of the metadata of a slab cache encounters corruption.
Also remove a superfluous line from the "help kmem" description of
the "kmem -I" option.
(k-hagio@ab.jp.nec.com, anderson@redhat.com)
- Account for the addition of the new ORC unwinder "orc_entry.end"
member in kernel commit d31a580266eeb1f355df90fde8a71f480e30ad70,
titled "x86/unwind/orc: Detect the end of the stack".
(anderson@redhat.com)
- Fix for the "trace.c" extension module for RHEL7.6, which moved the
ftrace_event_call.data member into a new structure contained within
an anonymous union. Without the patch, the module fails to load,
indicating "no commands registered: shared object unloaded".
(xuhuan.fnst@cn.fujitsu.com)
- Fix for the "vm -p", user-space "vtop", and "pte" commands in kernels
where the dimension of the static swap_info[] array is not contained
in the vmlinux file's debuginfo data. Without the patch, the
translation of a swapped-out PTE entry fails to determine the swap
device, and the commands display "cannot determine swap location".
(anderson@redhat.com)
- Fix for the swap offset calculation in the x86_64 "vm -p", "pte", and
user-space "vtop" commands. The swap offset bits in an x86_64 PTE
were changed in Linux 4.6, and then again in Linux 4.18.1 with the
new L1TF security patchset. Without the patch, the commands show an
incorrect swap offset value in the later kernels, or in older kernels
with an L1TF backport.
(anderson@redhat.com)
- Fix for the "kmem -V" option on Linux 4.14 and later kernels that are
configured without CONFIG_NUMA, and therefore do not contain the
"numa_stat_item" enumeration. Without the patch, the command causes
the crash session to abort with the error messages "double free or
corruption (!prev)" followed by "Aborted (core dumped)".
(k-hagio@ab.jp.nec.com)
- Introduction of a new "kmem -r" option. With the implementation of
per-cgroup kmem_cache slabs, the number of slab caches displayed by
"kmem -s" can number into the thousands. Similar to /proc/slabinfo,
this new option displays the accumulated data of the root cache and
its children. It is limited to Linux 4.11 and later kernels that
contain the "slab_root_caches" list. Currently the command option
is restricted to kernels configured with CONFIG_SLUB.
(k-hagio@ab.jp.nec.com)
- Fix for Linux 4.19-rc1 and later kernels that contain kernel commit
2c4704756cab7cfa031ada4dab361562f0e357c0, titled "pids: Move the pgrp
and session pid pointers from task_struct to signal_struct". Without
the patch, the crash session fails during initialization with the
message "crash: invalid structure member offset: task_struct_pids".
(anderson@redhat.com)
- Fix for Linux 4.19-rc1 and later kernels that contain kernel commit
7290d58095712a89f845e1bca05334796dd49ed2, titled "module: use
relative references for __ksymtab entries". Without the patch,
kernels configured with CONFIG_HAVE_ARCH_PREL32_RELOCATIONS fail
during session initialization, with a dump of the internal buffer
allocation stats followed by the message "crash: cannot allocate
any more memory!"
(asmadeus@codewreck.org)
- Fix a cut-and-paste error in the previous patch application.
(anderson@redhat.com)
- Fix for the "files" command in Linux 4.17 and later kernels that
contain commit b93b016313b3ba8003c3b8bb71f569af91f19fc7, titled
"page cache: use xa_lock". Without the patch, the "files -c" option
fails with the message "files: -c option not supported or applicable
on this architecture or kernel", and the "files -p <inode>" option
fails in a similar manner.
(k-hagio@ab.jp.nec.com)
- Fix for the "files -p <inode>" option. Without the patch, the
command attempts to translate radix tree node slot entries that
are RADIX_TREE_EXCEPTIONAL_ENTRY types, and as a result may fail
prematurely with an error message of the sort "files: do_radix_tree:
callback operation failed: entry: 5 item: 44788c5000a".
(anderson@redhat.com)
- Commit 3db3d3992d781c1e42587d2d2bf81e785408e0c2 in crash-7.1.8 was
aimed at making the PPC64 "bt" command work for dumpfiles saved
with the FADUMP facility, but it introduced a bit of unwarranted
complexity in "bt" command processing. Reworked the "bt" command
processing for PPC64 arch to make it a little less complicated and
also to print symbols for NIP and LR registers in exception frames.
Without the patch, "bt" on non-panic active tasks may fail with
the message "bt: invalid kernel virtual address: <address>
type: Regs NIP value".
(hbathini@linux.ibm.com)
- An addendum to crash commit 5fe78861ea1589084f6a2956a6ff63677c9269e1,
this patch for the PPC64 "bt" command prevents an invalid error
message from being displayed when an active non-panic task is
interrupted while running in user space. Without the patch, the
command correctly indicates "Task is running in user space", dumps
the user-space exception frame, but then prints the invalid error
message "bt: invalid kernel virtual address: ffffffffffffff90 type:
Regs NIP value".
(anderson@redhat.com)
(09/21/18)
7.2.3-1.fc29
Information for build crash-7.2.3-1.fc29
https://koji.fedoraproject.org/koji/buildinfo?buildID=1083119
7.2.3 - Fix for a crash-7.2.2 regression that may cause the "mount"
command to generate a segmentation violation. The bug is
dependent upon the compiler version used to build the crash
utility, where a buffer overrun is not seen with more recent
versions of gcc, which hide the bug due to a different stack
layout of a function's local variables.
(anderson@redhat.com)
- Fix for a second crash-7.2.2 buffer overrun regression that may
cause the "rd -S" option to generate a segmentation violation
if a displayed memory location contains a slab object address.
(anderson@redhat.com)
- Fix for a third, highly unlikely, crash-7.2.2 buffer overrun
regression, that could potentially occur during session
initialization.
(anderson@redhat.com)
(05/17/18)
7.2.2 - Fix to support Linux 4.16-rc1 and later ARM64 kernels, which
fail during session initialization with the error message
"crash: cannot determine page size". The failure to determine
the page size is due to the combination of the following kernel
commits:
- Linux 4.6 commit 6ad1fe5d9077a1ab40bf74b61994d2e770b00b14
arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
- Linux 4.10 commit 4b65a5db362783ab4b04ca1c1d2ad70ed9b0ba2a
arm64: Introduce uaccess_{disable,enable} functionality based on TTBR0_EL1
- Linux 4.16 commit 1e1b8c04fa3451e2b7190930adae43c95f0fae31
arm64: entry: Move the trampoline to be before PAN
(takahiro.akashi@linaro.org)
- Fix the search for the booted kernel on a live system to prevent
selecting the unusable "vmlinux.o" file found in private build
directories. Without the patch, the non-executable vmlinux.o file
may be selected, and the resulting fatal error message indicates a
somewhat misleading "crash: cannot resolve _stext".
(bhsharma@redhat.com, anderson@redhat.com)
- Implemented a new "ps -A" option that restricts the task output to
just the active tasks on each cpu.
(atomlin@redhat.com)
- As the first step in optimizing the is_page_ptr() function, save
the maximum SPARSEMEM section number during initialization, and
use it as the topmost delimiter in subsequent mem_section searches.
Also allow for per-architecture machdep->is_page_ptr() plugin functions.
(anderson@redhat.com)
- Implemented the x86_64 machdep->is_page_ptr() plugin function. If
the kernel is configured with CONFIG_SPARSEMEM_VMEMMAP, the plugin
function optimizes the mem_section search, reducing the computation
effort and time consumed by commands that repeatedly call the
is_page_ptr() function on large-memory systems.
(k-hagio@ab.jp.nec.com)
- Fixes for 32-bit X86 "bt" command on kernels that have been compiled
with retpoline gcc support. Without the patch, backtraces may fail
with the error message "bt: cannot resolve stack trace", followed by
the text symbols found on the stack and possible exception frames.
(anderson@redhat.com)
- Fix the "help foreach" argument list to include the new "gleader"
task qualifier option that was added in version 7.1.2.
(anderson@redhat.com)
- VMware VMSS dumpfiles contain the state of each vCPU at the time
when the VM was suspended. This patch enables crash to read the
relevant registers from each vCPU state for use as the starting hooks
by the "bt" command. Also, support for "help -[D|n]" to display
dumpfile contents, and "help -r" to display vCPU register sets has
been implemented. This is also the first step towards implementing
automatic KASLR offset calculations for VMSS dumpfiles.
(slp@redhat.com)
- Commit 45b74b89530d611b3fa95a1041e158fbb865fa84 added support for
calculating phys_base and the mapped kernel offset for KASLR-enabled
kernels on SADUMP dumpfiles by using a technique developed by Takao
Indoh. Originally, the patchset included support for kdumps, but this
was dropped in v2, as it was deemed unnecessary due to the upstream
implementation of the "vmcoreinfo device" in QEMU. However, there
are still several reasons for which the vmcoreinfo device may not be
present at the time when a memory dump is taken from a VM, ranging
from a host running older QEMU/libvirt versions, to misconfigured VMs
or environments running hypervisors that don't support this device.
This patchset generalizes the KASLR-related functions from sadump.c
and moves them to kaslr_helper.c, and makes kdump analysis fall back
to KASLR offset calculation if vmcoreinfo data is missing.
(slp@redhat.com)
- Fix for the "bt" command on 4.16 and later kernels in which the
"thread_union" data structure is not contained in the vmlinux file's
debuginfo data. Without the patch, the kernel stack size is not
calculated correctly, and defaults to 8K. As a result "bt" fails
with the message "bt: invalid RSP: <address> bt->stackbase/stacktop:
<address>/<address> cpu: <number>".
(efault@gmx.de)
- Fix for the x86_64 "bt" command for kernels that are configured with
CONFIG_FRAME_POINTER. Without the patch, the per-text-return-address
framesize cache may contain invalid entries for functions that have
an "and $0xfffffffffffffff0,%rsp" instruction in their prologue,
which aligns the stack on a 16-byte boundary; therefore any cached
framesize for a text-return-address in such a function may be
incorrect depending upon the alignment of the stack address of a
calling function. If an invalid cached framesize is utilized by
"bt", the backtrace may skip over several frames, or may display
one or more invalid (stale) frames. The patch introduces a new
cache that contains functions for which framesize values should
not be cached.
(anderson@redhat.com)
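The reason a cached framesize can be wrong for such functions can be
seen with a small calculation. The Python sketch below is illustrative
only; the helper names are hypothetical and the addresses are made up.
It shows that the number of bytes discarded by the alignment
instruction depends on the stack pointer value at entry, so the same
function can have different effective frame sizes on different calls.

```python
ALIGN_MASK = 0xfffffffffffffff0


def aligned_rsp(rsp):
    """Model 'and $0xfffffffffffffff0,%rsp' on a 64-bit stack pointer."""
    return rsp & ALIGN_MASK


def bytes_dropped_by_alignment(rsp):
    """Extra stack bytes consumed by the alignment for this entry RSP."""
    return rsp - aligned_rsp(rsp)
```

Two calls that enter with stack pointers 8 bytes apart consume 8 and 0
extra bytes respectively, so a framesize measured on one call does not
necessarily apply to the other.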
- Speed up the "bt" command by avoiding the text value cache that
was put in place many years ago when the crash utility supported the
analysis of remote dumpfiles using the deprecated "crash daemon"
running on the remote host. The performance improvement will be
most noticeable when running the first instance of "foreach bt",
where there would often be a "hitch" when it was determining the
framesize of kernel module text return addresses.
(anderson@redhat.com)
- Optimization of the crash startup time and "ps" command processing
time when analyzing dumpfiles/systems with extremely large task
counts. For example, running with a dumpfile containing over a
million tasks, startup time and "ps" processing time was reduced
from 90 minutes to less than 40 seconds.
(gthelen@google.com)
- Speed up the "ps -r" option by stashing the length of the
task_struct.rlim or signal_struct.rlim array in the internal
array_table[]. Without the patch, the length of the array
is determined by a call to the embedded gdb module for each
task, and as a result, the command takes a minute or more
per 1000 tasks. With the patch applied, it only takes about
0.5 seconds per 1000 tasks.
(k-hagio@ab.jp.nec.com)
- Added a new "tree -l" option for the rbtree display, which dumps
the tree sorted in linear order, starting with the leftmost node and
progressing to the right. Also, if a corrupted rb_node pointer is
encountered, do not fail immediately, but rather display the rb_node
address and the corrupt pointer and continue.
(neelx@redhat.com)
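The linear ordering described above corresponds to an in-order walk of
the tree. The Python sketch below is a generic illustration, not the
crash implementation: the nodes dictionary, the use of 0 for a missing
child, and the name linear_order are assumptions. A non-zero pointer
that does not refer to a known node is treated as corrupt; it is
recorded and the walk continues rather than failing.

```python
def linear_order(nodes, root):
    """Return (addresses in left-to-right order, corrupt pointers).

    nodes maps a node address to a (left, right) pair; 0 means no child.
    """
    order, corrupt, stack = [], [], []
    cur = root
    while stack or cur:
        # Descend to the leftmost reachable node, remembering the path.
        while cur:
            if cur not in nodes:
                corrupt.append(cur)   # report it, then carry on
                cur = 0
                break
            stack.append(cur)
            cur = nodes[cur][0]
        if not stack:
            break
        cur = stack.pop()
        order.append(cur)
        cur = nodes[cur][1]           # then walk the right subtree
    return order, corrupt
```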
- Display a fatal error message if the "tree -l" option is attempted
with radix trees. Without the patch, the option would be silently
ignored.
(neelx@redhat.com)
- Introduction of a new "bpf" command that displays information about
loaded eBPF (extended Berkeley Packet Filter) programs and maps.
Because of its upstream fluidity, the capabilities of this command
will be an ongoing task. In its initial form, the command displays
the addresses, basic information, and key data structures of eBPF
programs and maps. It also translates the bytecode, and disassembles
the jited code, of loaded eBPF programs.
(anderson@redhat.com)
- Fixes to address several gcc-8.0.1 compiler warnings that are generated
when building with "make warn". The warnings are all false alarm
messages of type [-Wformat-overflow=], [-Wformat-truncation=] and
[-Wstringop-truncation]; the affected files are extensions.c, task.c,
kernel.c, memory.c, remote.c, symbols.c, filesys.c and xen_hyper.c.
(anderson@redhat.com)
- Fix for the "ps -a" option for a user task that has utilized
"prctl(PR_SET_MM, ...)" to self-modify its memory map such
that the stack locations of its command line arguments and
environment variables are not contiguous. Without the
patch, the command may fail with a dump of the crash utility's
internal buffer usage statistics followed by "ps: cannot allocate
any more memory!".
(k-hagio@ab.jp.nec.com)
- Fix for a compilation error on ARM64. Without the patch, the
compilation of the new bpf.c file fails with the error message
"bpf.c:881:18: error: conflicting types for 'u64'"
(anderson@redhat.com)
- Fix for an s390x session initialization-time warning that indicates
"WARNING: cannot determine MAX_PHYSMEM_BITS" on Linux 4.15 and later
kernels containing commit 83e3c48729d9ebb7af5a31a504f3fd6aff0348c4,
which changed the data type of "mem_section" from an array to a
pointer. Without the patch, the s390x manner of determining
MAX_PHYSMEM_BITS fails because it presumes that "mem_section" is
an array, and as a result, displays the warning message.
(anderson@redhat.com)
- Fix for the determination of the ARM64 phys_offset value when
running live against /proc/kcore. Without the patch, the message
"WARNING: cannot access vmalloc'd module memory" may be displayed
during session initialization, and vmalloc/module memory will be
inaccessible. (It should be noted that at the time of this patch,
the upstream version of /proc/kcore does not work correctly for
ARM64, because PT_LOAD segments for unity-mapped blocks of physical
memory are not generated.)
(anderson@redhat.com)
- For live system analysis, if both "/dev/mem" and the "/dev/crash"
memory driver do not exist, try to use "/proc/kcore". Without
the patch, the session fails immediately with the error message
"crash: /dev/mem: No such file or directory".
(anderson@redhat.com)
- Fix, and an update, for the "ipcs" command. The fix addresses an
error where IPCS entries are not displayed because of a faulty
read of the "deleted" member of the embedded "kern_ipc_perm" data
structure. The "deleted" member was being read as a 4-byte integer,
but since it is declared as a "bool" type, only the lowest byte gets
set to 1 or 0. Since the structure is not zeroed-out when allocated,
stale data may be left in the upper 3 bytes, and the IPCS entry
gets rejected. The update is required for Linux 4.11 and greater
kernels, which reimplemented the IDR facility to use radix trees
in kernel commit 0a835c4f090af2c76fc2932c539c3b32fd21fbbb, titled
"Reimplement IDR and IDA using the radix tree". Without the patch,
if any IPCS entry exists, the command would fail with the message
"ipcs: invalid structure member offset: idr_top"
(anderson@redhat.com)
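The faulty read of the "deleted" member can be reproduced in a few
lines. The Python sketch below is illustrative only, assuming a
little-endian layout in which the bool occupies the first of the four
bytes; the function names are hypothetical.

```python
import struct


def read_deleted_as_int(raw4):
    """Faulty read: interpret four bytes as a little-endian 32-bit int."""
    return struct.unpack("<i", raw4)[0]


def read_deleted_as_bool(raw4):
    """Intended read: only the first byte holds the bool value."""
    return raw4[0] != 0
```

With raw4 = bytes([0x00, 0xab, 0xcd, 0x12]), the bool is false but the
4-byte read is non-zero because of the stale upper bytes, so the entry
would be treated as deleted and rejected.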
- Second stage of the new "bpf" command. This patch adds additional
per-program and per-map data for the "bpf -p ID" and "bpf -m ID"
options, containing data items shown by the "bpftool prog list"
and "bpftool map list" options; new "bpf -P" and "bpf -M" options
have been added that dump the extra data for all loaded programs
or maps.
(anderson@redhat.com)
- Fix for a compilation error of the new "bpf.c" file when building
on older host systems where CLOCK_BOOTTIME does not exist.
(anderson@redhat.com)
- Fix for infrequent failures of the x86 "bt" command to handle cases
where a user space task has "resume_userspace" or "entry_INT80_32"
at the top of the stack, or was interrupted by the crash NMI
while handling a timer interrupt. Without the patch, the backtrace
would be preceded by the error message "bt: cannot resolve stack
trace", and then dump the text symbols found on the stack and all
possible exception frames.
(anderson@redhat.com)
- Trivial formatting fix to "bpf" help page.
(anderson@redhat.com)
- Fix the "bpf" command display on Linux 4.17-rc1 and later kernels,
which contain two new program types, BPF_PROG_TYPE_RAW_TRACEPOINT
and BPF_PROG_TYPE_CGROUP_SOCK_ADDR. Without the patch, the dynamic
header string created for bpf programs overran into the bpf map
header, creating one long combined header string.
(anderson@redhat.com)
- Updates for the presumption that system call names begin with "sys_".
In Linux 4.17, x86_64 system calls may begin with "__x64_sys", where,
for example, "sys_read" has been replaced by "__x64_sys_read".
(anderson@redhat.com)
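The naming change above means that a simple "sys_" prefix test no
longer matches every system call symbol. A minimal Python sketch of a
prefix check that accepts both forms follows; the helper names are
hypothetical, and only the two prefixes mentioned in the entry are
handled.

```python
# Longer prefix first, so "__x64_sys_read" is not left partly stripped.
SYSCALL_PREFIXES = ("__x64_sys_", "sys_")


def is_syscall_symbol(name):
    """True if the symbol name starts with a known system call prefix."""
    return name.startswith(SYSCALL_PREFIXES)


def syscall_base_name(name):
    """Strip the prefix, e.g. "__x64_sys_read" and "sys_read" -> "read"."""
    for prefix in SYSCALL_PREFIXES:
        if name.startswith(prefix):
            return name[len(prefix):]
    return None
```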
(05/16/18)
7.2.1-2.fc29
Information for build crash-7.2.1-2.fc29
https://koji.fedoraproject.org/koji/buildinfo?buildID=1049079
7.2.1-2.fc28
Information for build crash-7.2.1-2.fc28
https://koji.fedoraproject.org/koji/buildinfo?buildID=1049179
7.2.1-1.fc28
Information for build crash-7.2.1-1.fc28
https://koji.fedoraproject.org/koji/buildinfo?buildID=1045333
7.2.1 - Fix for the "runq" command on Linux 4.14 and later kernels that
contain commit cd9e61ed1eebbcd5dfad59475d41ec58d9b64b6a, titled
"rbtree: cache leftmost node internally". Without the patch,
the command fails with the error message "runq: invalid structure
member offset: cfs_rq_rb_leftmost".
(anderson@redhat.com)
- Fix to prevent a useless message during session initialization.
Without the patch, if the highest possible node bit in the
node_states[N_ONLINE] multi-word bitmask is set, then a message
such as "crash: next_online_node: 256 is too large!" will be
displayed.
(anderson@redhat.com)
- Additional fixes for the ARM64 "bt" command for Linux 4.14 kernels.
The patch corrects the contents of in-kernel exception frame register
dumps, and properly transitions the backtrace from the IRQ stack
to the process stack.
(takahiro.akashi@linaro.org)
- Implemented a new "search -T" option, which is identical to the
"search -t" option, except that the search is restricted to the
kernel stacks of active tasks.
(atomlin@redhat.com)
- Removal of the ARM64 "bt -o" option for Linux 4.14 and later kernels,
along with several cleanups/readability improvements.
(takahiro.akashi@linaro.org)
- Fix for support of KASLR enabled kernels captured by the SADUMP
dumpfile facility. SADUMP dumpfile headers do not contain phys_base
or VMCOREINFO notes, so without this patch, the crash session fails
during initialization with the message "crash: seek error: kernel
virtual address: <address> type: page_offset_base". This patch
calculates the phys_base value and the KASLR offset using the IDTR
and CR3 registers from the dumpfile header.
(indou.takao@jp.fujitsu.com)
- Implemented a new "ps -y policy" option to filter the task display
by scheduling policy. Applicable to both standalone ps invocation
as well as via foreach.
(oleksandr@redhat.com)
- Fix for the "kmem -[sS]" options on Linux 4.14 and later kernels that
contain commit 2482ddec670fb83717d129012bc558777cb159f7, titled
"mm: add SLUB free list pointer obfuscation". Without the patch,
there will be numerous error messages of the type "kmem: <cache name>
slab: <address> invalid freepointer: <obfuscated address>" if
the kernel is configured with CONFIG_SLAB_FREELIST_HARDENED.
(anderson@redhat.com)
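For illustration, the obfuscation in the kernel commit above stores each
free pointer XORed with a per-cache random value and with the address
where the pointer itself is stored, so the raw value read from a dumpfile
looks invalid until it is decoded. A minimal Python sketch with made-up
values and a hypothetical function name; it is not the crash
implementation:

```python
MASK64 = (1 << 64) - 1

def decode_freepointer(stored, cache_random, ptr_addr):
    """Undo the hardening XOR: stored == ptr ^ random ^ ptr_addr."""
    return (stored ^ cache_random ^ ptr_addr) & MASK64

# Made-up example values, for illustration only.
real_next = 0xffff880012345680       # the genuine next-free-object pointer
cache_random = 0x5a5a1234deadbeef    # stands in for the per-cache random value
ptr_addr = 0xffff880012345600        # where the free pointer is stored

stored = real_next ^ cache_random ^ ptr_addr   # what the dumpfile contains
print(hex(stored))                                          # obfuscated
print(hex(decode_freepointer(stored, cache_random, ptr_addr)))  # decoded
```

Because XOR is its own inverse, applying the same operation to the stored
value recovers the original pointer.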
- Fix for the validation of the bits located in the least significant
bits of mem_section.section_mem_map pointers. Without the patch,
the validation functions always returned valid, due to a coding
error found by clang. However, it was never really a problem
because it is extremely unlikely that an existing mem_section would
ever be invalid.
(oleksandr@redhat.com, anderson@redhat.com)
- Fix for the x86_64 kernel virtual address to physical address
translation mechanism. Without the patch, when verifying that the
PAGE_PRESENT bit is set in the top-level page table, it would always
test positively, and the translation would continue parsing the
remainder of the page tables. This would virtually never be a
problem in practice because if the top-level page table entry
existed, its PAGE_PRESENT bit would be set.
(oleksandr@redhat.com, anderson@redhat.com)
- Removed a check for a negative block_size value which is always a
non-negative unsigned value in the SADUMP header parsing function.
(oleksandr@redhat.com)
- Removed a check for an impossible negative value when calculating
the beginning address when applying the context value specified by
the "search -x <count>" option.
(oleksandr@redhat.com)
- Implemented a new "timer -C <cpu-specifier>" option that restricts
the timer or hrtimer output to the timer queue data associated with
one or more cpus. For multiple cpus, the cpu-specifier uses the
standard comma or dash separated list format.
(oleksandr@redhat.com)
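As an illustration of the comma or dash separated list format, a
specifier such as "1,3-5" expands to cpus 1, 3, 4 and 5. A minimal
Python sketch; the function name is hypothetical and any further
keywords that crash may accept are not handled here:

```python
def parse_cpu_list(spec):
    """Expand a list such as "1,3-5" into sorted cpu numbers."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError("bad range: " + part)
            cpus.update(range(first, last + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)

print(parse_cpu_list("1,3-5"))
```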
- Fix for a "ps -l" regression introduced by the new "ps -y" option
added above. Without the patch, the -l option generates a
segmentation violation if not accompanied by a -C cpu specifier
option.
(vinayakm.list@gmail.com)
- Fix for the "kmem -i" and "kmem -V" options in Linux 4.8 and later
kernels containing commit 75ef7184053989118d3814c558a9af62e7376a58,
titled "mm, vmstat: add infrastructure for per-node vmstats".
Without the patch, the CACHED line of "kmem -i" shows 0, and the
VM_STAT section of "kmem -V" is missing entirely.
(vinayakm.list@gmail.com)
- Fix for Linux 4.11 and later kernels that contain kernel commit
4b3ef9daa4fc0bba742a79faecb17fdaaead083b, titled "mm/swap: split
swap cache into 64MB trunks". Without the patch, the CACHED line
of "kmem -i" may show nonsensical data.
(vinayakm.list@gmail.com)
- Implemented a new "dev -D" option that is the same as "dev -d", but
filters out the display of disks that have no I/O in progress.
(oleksandr@redhat.com)
- If a line number request for a module text address initially fails,
force the embedded gdb module to complete its two-stage strategy
used for reading debuginfo symbol tables from module object files,
and then retry the line number extraction. This automatically does
what the "mod -r" or "crash --readnow" options accomplish.
(anderson@redhat.com)
- Update for support of Linux 4.12 and later PPC64 kernels where the
hash page table geometry accommodates a larger virtual address range.
Without the patch, the virtual-to-physical translation of user space
virtual addresses by "vm -p", "vtop", and "rd -u" may generate an
invalid translation or otherwise fail.
(hbathini@linux.vnet.ibm.com)
- Implemented a new "runq -T" option that displays the time lag of each
CPU relative to the most recent runqueue timestamp.
(oleksandr@redhat.com)
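The lag calculation can be pictured as subtracting each cpu's runqueue
timestamp from the newest timestamp seen on any cpu. A minimal Python
sketch, assuming a hypothetical dictionary that maps cpu numbers to
runqueue timestamps in nanoseconds; the output format of "runq -T" is
not reproduced:

```python
def runq_lags(timestamps):
    """Return each cpu's lag behind the most recent runqueue timestamp."""
    newest = max(timestamps.values())
    return {cpu: newest - ts for cpu, ts in timestamps.items()}

# Made-up timestamps, for illustration only.
print(runq_lags({0: 5000, 1: 4200, 2: 4990}))
```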
- Fix to support Linux 4.15 and later kernels that contain kernel
commit e8cfbc245e24887e3c30235f71e9e9405e0cfc39, titled "pid: remove
pidhash". The kernel's traditional usage of a pid_hash[] array to
store PIDs has been replaced by an IDR radix tree, requiring a new
crash plug-in function to gather the system's task set. Without the
patch, the crash session fails during initialization with the error
message "crash: cannot resolve init_task_union".
(anderson@redhat.com)
- Fix for the "net" command when the network device listing has an
unusually large number of IP addresses. In that case, without the
patch, the command may generate a segmentation violation.
(k-hagio@ab.jp.nec.com)
- Fix for Linux 4.15 and later kernels that are configured with
CONFIG_SPARSEMEM_EXTREME, and that contain kernel commit
83e3c48729d9ebb7af5a31a504f3fd6aff0348c4, titled "mm/sparsemem:
Allocate mem_section at runtime for CONFIG_SPARSEMEM_EXTREME=y".
Without the patch, kernels configured with SPARSEMEM_EXTREME
have changed the data type of "mem_section" from an array to
a pointer, leading to errors in commands such as "kmem -p",
"kmem -n", "kmem -s", and any other command that translates a
physical address to its page struct address.
(anderson@redhat.com)
- With the latest PPC64 NMI IPI changes, crash_ipi_callback is found
multiple times on the stack of active non-panic tasks. Ensure that
the symbol reference relates to an actual backtrace stack frame.
(hbathini@linux.vnet.ibm.com)
- Update the starting virtual address of vmalloc space for kernels
configured with CONFIG_X86_5LEVEL.
(douly.fnst@cn.fujitsu.com)
- Update the X86_64 VSYSCALL_END address to reflect that it only
contains 1 page.
(douly.fnst@cn.fujitsu.com)
- Prevent the X86_64 FILL_PML() macro from updating the internal
machdep->machspec->last_pml4_read address every time a vmalloc'd
kernel virtual address is translated.
(douly.fnst@cn.fujitsu.com)
- Fix for the "bt" command in x86_64 kernels that contain, or have
backports of, kernel commit 4950d6d48a0c43cc61d0bbb76fb10e0214b79c66,
titled "x86/dumpstack: Remove 64-byte gap at end of irq stack".
Without the patch, backtraces fail to transition from the IRQ stack
back to the process stack, showing an error message such as
"bt: cannot transition exception stack to IRQ stack to current
process stack".
(anderson@redhat.com)
- Initial pass for support of kernel page table isolation. The x86_64
"bt" command may indicate "bt: cannot transition from exception stack
to current process stack" if the crash callback NMI occurred while an
active task was running on the new entry trampoline stack. This has
only been tested on the RHEL7 backport of the upstream patch because
as of this commit, crash does not run on 4.15-rc kernels. Further
changes may be required for upstream kernels, and distributions that
implement the kernel changes differently than upstream.
(anderson@redhat.com)
- Fix for the "bt" command and the "ps -s" option for zombie tasks
whose kernel stacks have been freed/detached. Without the patch,
the "bt" command indicates "bt: invalid kernel virtual address: 0
type: stack contents" and "bt: read of stack at 0 failed"; with the
patch, it displays "(no stack)". The "ps -s" option would fail
prematurely upon reaching such a task, indicating "ps: invalid kernel
virtual address: 0 type: stack contents" and "ps: read of stack at 0
failed".
(anderson@redhat.com)
- Fix for running on live systems on 4.15-rc2 and later kernels that
are configured with CONFIG_RANDOMIZE_BASE and contain kernel commit
668533dc0764b30c9dd2baf3ca800156f688326b, titled "kallsyms: take
advantage of the new '%px' format". Without the patch, a live crash
session does not show the "WARNING: kernel relocated ..." message
expected with KASLR, and then displays the message "crash: cannot set
context for pid: <pid>" prior to generating a SIGSEGV.
(anderson@redhat.com)
- Fix for 4.15-rc5 and later x86_64 kernels that contain kernel commit
c482feefe1aeb150156248ba0fd3e029bc886605, titled "x86/entry/64: Make
cpu_entry_area.tss read-only". Without the patch, the addresses and
sizes of the x86_64 exception stacks cannot be determined; therefore
if a backtrace starts on one of the exception stacks, then the "bt"
command will fail.
(anderson@redhat.com)
- Additional fix for support of KASLR enabled kernels captured by the
SADUMP dumpfile facility, where this patch fixes a problem when Page
Table Isolation(PTI) is enabled. When PTI is enabled, bit 12 of CR3
register is used to split user space and kernel space. Also bit 11:0
is used for Process Context IDentifiers(PCID). To open an SADUMP
dumpfile, the value of CR3 is used to calculate KASLR offset and
phys_base; this patch masks the CR3 register value correctly for
a PTI enabled kernel.
(indou.takao@jp.fujitsu.com)
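To make the masking concrete: with PTI enabled, bits 11:0 of CR3 carry
the PCID and bit 12 selects the user or kernel page table, so all of
those bits have to be cleared before the value is treated as the page
table base. A minimal Python sketch with a made-up CR3 value and
hypothetical names; the exact mask used by crash is an assumption:

```python
PCID_MASK = 0xfff            # bits 11:0, Process Context IDentifier
PTI_SWITCH_BIT = 1 << 12     # bit 12, user/kernel page table selector

def pgd_from_cr3(cr3):
    """Strip the PCID and PTI switch bits from a CR3 value."""
    return cr3 & ~(PCID_MASK | PTI_SWITCH_BIT)

# Made-up CR3: base 0x7a5e4000 with the PTI bit set and PCID 5.
cr3 = 0x7a5e4000 | PTI_SWITCH_BIT | 0x005
print(hex(cr3), "->", hex(pgd_from_cr3(cr3)))
```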
- Second phase of future support for x86_64 5-level page tables. This
patch is a cleanup/collaboration of the original logic used by the
various vtop functions, where several new common functions have been
added for extracting page table entries from PGD, P4D, PUD, PMD and
PTE pages. The usage of the former PML4 and UPML pages has been
replaced with the use of the common PGD page, along with the PUD page
in 4-level page table translation. Support for 5-level page tables
has been incorporated into the existing x86_64_kvtop() and
x86_64_uvtop_level4() functions. Backwards compatibility for older
legacy kernels has been maintained. The third phase of support will
automatically detect whether the kernel proper, and whether an
individual user task, is utilizing 5-level page tables. This patch
enables support for kernel-only 5-level page tables by entering the
command line option "--machdep vm=5level".
(douly.fnst@cn.fujitsu.com)
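As a sketch of what the common extraction functions have to compute:
on x86_64 each page table level indexes 512 entries with 9 bits of the
virtual address, starting above the 12-bit page offset, and 5-level
paging inserts the P4D level between the PGD and the PUD. The Python
below uses hypothetical names and is not the crash code:

```python
PAGE_SHIFT = 12
BITS_PER_LEVEL = 9           # 512 entries per table
INDEX_MASK = (1 << BITS_PER_LEVEL) - 1

def table_indices(vaddr, levels):
    """Return the per-level table index of vaddr for 4 or 5 levels."""
    if levels == 5:
        names = ["PTE", "PMD", "PUD", "P4D", "PGD"]
    elif levels == 4:
        names = ["PTE", "PMD", "PUD", "PGD"]
    else:
        raise ValueError("levels must be 4 or 5")
    indices = {}
    for depth, name in enumerate(names):
        shift = PAGE_SHIFT + depth * BITS_PER_LEVEL
        indices[name] = (vaddr >> shift) & INDEX_MASK
    return indices

# Made-up address built from known indices, for illustration only.
vaddr = (3 << 48) | (5 << 39) | (7 << 30) | (9 << 21) | (11 << 12)
print(table_indices(vaddr, 5))
print(table_indices(vaddr, 4))
```

With 4 levels the PGD index is taken at shift 39; with 5 levels that
position belongs to the P4D and the PGD index moves up to shift 48.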
- Xen commit 615588563e99a23aaf37037c3fee0c413b051f4d (Xen 4.0.0.)
extended the direct mapping to 5 TB. This area was previously
reserved for future use, so it is OK to simply change the upper
bound unconditionally.
(ptesarik@suse.com)
- Add a new "foreach gleader" qualifier option, restricting the output
to user-space tasks that are thread group leaders.
(Jan.Karlsson@sony.com)
- Since Xen commit 666aca08175b ("sched: use the auto-generated list of
schedulers") crash cannot open Xen vmcores because the "schedulers"
symbol no longer exists. Xen 4.7 implemented schedulers as its own
section in "xen/arch/x86/xen.lds.S", delimited by the two symbols
"__start_schedulers_array" and "__end_schedulers_array". Without
the patch, the crash session fails during initialization with the
error message "crash: cannot resolve schedulers".
(npajkovsky@suse.cz)
- Fix the sample crash.ko memory driver to prevent an s390X kernel
addressing exception. Legitimate pages of RAM that successfully
pass the page_is_ram() and pfn_valid() verifier functions may not
be provided by the s390x hypervisor, and the memcpy() from the
non-existent memory to the bounce buffer panics the kernel. The
patch replaces the memcpy() call with probe_kernel_read().
(anderson@redhat.com)
- Fix for the ARM64 "bt" command running against Linux 4.14 and
later kernels. Without the patch, the backtraces of the active
tasks in a kdump-generated dumpfile are truncated: the panic
task will just show the "crash_kexec" frame
and the kernel-entry user-space exception frame; the non-panic
tasks will show their backtraces starting from the stackframe
addresses captured in the per-cpu NT_PRSTATUS notes, and will
not display the exception frame generated by the NMI callback,
nor any stackframes on the IRQ stack.
(anderson@redhat.com)
- Fix for the ARM64 "bt" command in kernels that contain commit
30d88c0e3ace625a92eead9ca0ad94093a8f59fe, titled "arm64: entry:
Apply BP hardening for suspicious interrupts from EL0". Without
the patch, there may be invalid kernel exception frames
displayed on an active task's kernel stack, often below a stackframe
of the "do_el0_ia_bp_hardening" function; the address translation
of the PC and LR values in the bogus exception frame will
display "[unknown or invalid address]".
(anderson@redhat.com)
(02/13/18)
7.2.0-1.fc28
Information for build crash-7.2.0-1.fc28
https://koji.fedoraproject.org/koji/buildinfo?buildID=978501
7.2.0 - Fix for the "snap.so" extension module to pass the KASLR relocation
offset value in the ELF header for x86_64 kernels that are compiled
with CONFIG_RANDOMIZE_BASE. Without the patch, it is necessary to
use the "--kaslr=<offset>" command line option, or the session
fails with the message "WARNING: cannot read linux_banner string",
followed by "crash: vmlinux and vmcore do not match!".
(anderson@redhat.com)
- The native gdb "disassemble" command fails if the kernel has been
compiled with CONFIG_RANDOMIZE_BASE because the embedded gdb module
still operates under the assumption that the (non-relocated) text
locations in the vmlinux file are correct. The error message that
is issued is somewhat confusing, indicating "No function contains
specified address". This patch simply clarifies the error message
to indicate "crash: the gdb "disassemble" command is prohibited
because the kernel text was relocated by KASLR; use the crash "dis"
command instead."
(anderson@redhat.com)
- Fix for the "mach -m" command in Linux 4.9 and later kernels that
contain commit 475339684ef19e46f4702e2d185a869a5c454688, titled
"x86/e820: Prepare e280 code for switch to dynamic storage", in
which the "e820" symbol was changed from a static e820map structure
to a pointer to an e820map structure. Without the patch, the
command either displays just the header, or the header with several
nonsensical entries.
(anderson@redhat.com)
- Fix for Linux 4.10 and later kdump dumpfiles, or kernels that have
backported commit 401721ecd1dcb0a428aa5d6832ee05ffbdbffbbe, titled
"kexec: export the value of phys_base instead of symbol address".
Without the patch, if the x86_64 "phys_base" value in the VMCOREINFO
note is a negative decimal number, the crash session fails during
session initialization with a "page excluded" or "seek error" when
reading "page_offset_base".
(anderson@redhat.com)
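The failure mode above comes down to reading the exported value as a
signed decimal number. A minimal Python sketch, assuming the note
carries a line of the form "NUMBER(phys_base)=<decimal>"; the function
name and sample text are hypothetical:

```python
def parse_phys_base(vmcoreinfo):
    """Return phys_base from VMCOREINFO text, or None if it is absent."""
    for line in vmcoreinfo.splitlines():
        key, sep, value = line.partition("=")
        if sep and key == "NUMBER(phys_base)":
            return int(value, 10)    # int() accepts a leading minus sign
    return None

# Made-up VMCOREINFO contents, for illustration only.
sample = "OSRELEASE=4.10.0\nNUMBER(phys_base)=-16777216\n"
print(parse_phys_base(sample))
```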
- Fix for the PPC64 "pte" command. Without the patch, if the target
PTE references a present page, the physical address is incorrect.
(anderson@redhat.com)
- Fix for a 32-bit MIPS compilation error if glibc-2.25 or later has
been installed on the host build machine. Without the patch, the
build fails with the error message "mips-linux-nat.c:157:1: error:
conflicting types for 'ps_get_thread_area'".
(dengke.du@windriver.com)
- Fix for the validity check of S390X virtual addresses for 5-level
page tables where user space memory is mapped above 8 Petabytes.
Without the patch, "rd -u" fails and indicates "invalid user virtual
address", and "vtop -u" indicates that the address is "(not mapped)".
(zaslonko@linux.vnet.ibm.com)
- Crash 7.1.5 commit c3413456599161cabc4e910a0ae91dfe5eec3c21 (xen: Add
support for dom0 with Linux kernel 3.19 and newer) from Daniel Kiper
implemented support for Xen dom0 vmcores after Linux 3.19 kernel
commit 054954eb051f35e74b75a566a96fe756015352c8 (xen: switch to
linear virtual mapped sparse p2m list). This patch can be deemed
subsequent to Daniel's patch, and implements support for Xen PV domU
dumpfiles for Linux 3.19 and later kernels.
(honglei.wang@oracle.com)
- Fix for the "dis" command to detect duplicate symbols in the case
of a "symbol+offset" argument where the duplicates are not contiguous
in the symbol list. Without the patch, the first of multiple symbol
instances is used in the address evaluation. With the patch, the
command will fail with the error message "dis: <symbol+offset>:
duplicate text symbols found:", followed by a list of the duplicate
symbols, and their file and line numbers if available.
(anderson@redhat.com)
- Enhancement to the error reporting mechanism for the "kmem -[sS]"
options. When a fatal error is encountered while gathering basic
CONFIG_SLUB statistics, it is possible that the slab cache name
is not displayed in the error message, and the line containing
the slab cache name, address, etc., is not displayed at all. With
this patch, an extra error message indicating "kmem: <cache-name>:
cannot gather relevant slab data" will be displayed under the
fatal error message; and under that, the CACHE address, cache NAME,
OBJSIZE, and SSIZE columns will be displayed, but with "?" under
the ALLOCATED, TOTAL, and SLABS columns.
(anderson@redhat.com)
- Fix to prevent the "tree -t radix" option from failing when it
encounters duplicate entries in a radix_tree_node[slots] array.
Without the patch, if a duplicate slot entry is found, the command
fails with the message "tree: duplicate tree entry: radix_tree_node:
<node address> slots[<index>]: <entry>\n". (The error can
be prevented if the command is preceded by "set hash off".) However,
certain radix trees contain duplicate entries by design, such as the
"pgmap_radix" radix tree, in which a radix_tree_node may contain
multiple instances of the same page_map structure. With the patch,
checks will only be made for duplicate radix_tree_node structures.
(anderson@redhat.com)
- First phase of future support for x86_64 5-level page tables. New
sets of virtual memory offsets have been #define'd and helper macros
and placeholder functions for the p4d page tables have been added.
The only functional changes with this patchset are dynamically-set
PGDIR_SHIFT and PHYSICAL_MASK_SHIFT values that are based upon the
kernel configuration.
(anderson@redhat.com)
- Fix for a build failure. Without the patch, if the build is done by
a user whose username cannot be determined from the user ID number,
the build fails immediately with a segmentation fault.
(sargun@sargun.me, anderson@redhat.com)
- Fix for Linux 4.13-rc0 commit 7fd8329ba502ef76dd91db561c7aed696b2c7720
"x86/boot/64: Rename init_level4_pgt and early_level4_pgt". Without
the patch, the crash session fails during initialization with the
error message "crash: cannot resolve "init_level4_pgt"".
(anderson@redhat.com)
- The internal "build_data" string contains the compile-time date,
the user id of the builder, and the build machine hostname, and is
viewable by the "crash --buildinfo" command line option or by the
"help -B" option during runtime. This patch replaces that string
data with "reproducible build" if the SOURCE_DATE_EPOCH environment
variable contains a value string when the crash binary is compiled.
(bwiedemann@suse.de)
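The selection logic described above can be sketched as follows. This
is a Python illustration only: the function name and the layout of the
non-reproducible string are made up, and only the "reproducible build"
text comes from the entry itself:

```python
import os

def build_data(date, user, host, env=None):
    """Pick the build string, honoring SOURCE_DATE_EPOCH when it is set."""
    env = os.environ if env is None else env
    if env.get("SOURCE_DATE_EPOCH"):
        return "reproducible build"
    # Hypothetical layout for the ordinary, non-reproducible case.
    return "%s by %s on %s" % (date, user, host)

print(build_data("Mon Jan  1 00:00:00 2018", "builder", "buildhost",
                 env={"SOURCE_DATE_EPOCH": "1500000000"}))
```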
- Fix for Linux 4.13-rc1 commit 2d070eab2e8270c8a84d480bb91e4f739315f03d
"mm: consider zone which is not fully populated to have holes".
Without the patch, SPARSEMEM page struct addresses are incorrectly
calculated because a new section state, and an associated flag bit,
has been added to the low bits of the mem_section.section_mem_map
address; the extra bit is erroneously passed back as part of the
section_mem_map and resultant page struct address, leading to
errors in commands such as "kmem -p", "kmem -s", "kmem -n", and any
other command that translates a physical address to its page struct
address.
(anderson@redhat.com)
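To show how the extra flag bit leaks into the address: Linux 4.13
keeps three flag bits in the low bits of section_mem_map, so a mask
that only clears two of them leaves bit 2 set in the result. A minimal
Python sketch using the kernel's flag names with a made-up pointer
value; the helper name is hypothetical:

```python
SECTION_MARKED_PRESENT = 1 << 0
SECTION_HAS_MEM_MAP = 1 << 1
SECTION_IS_ONLINE = 1 << 2       # the section state added in Linux 4.13
SECTION_MAP_LAST_BIT = 1 << 3
SECTION_MAP_MASK = ~(SECTION_MAP_LAST_BIT - 1)

OLD_MAP_MASK = ~((1 << 2) - 1)   # mask that predates SECTION_IS_ONLINE

def decode_section_mem_map(value):
    """Split section_mem_map into (address, flag bits)."""
    return value & SECTION_MAP_MASK, value & (SECTION_MAP_LAST_BIT - 1)

# Made-up encoded pointer with all three flag bits set.
raw = 0xffffea0000000000 | 0x7
addr, flags = decode_section_mem_map(raw)
print(hex(addr), hex(flags))
print(hex(raw & OLD_MAP_MASK))   # the stale mask leaves bit 2 behind
```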
- Enhancement to the S390X "vtop" command to display page table walk
information, adding output showing the following page table contents:
"Region-First-Table Entry" (RFTE)
"Region-Second-Table Entry" (RSTE)
"Region-Third-Table Entry" (RTTE)
"Segment Table Entry" (STE)
"Page Table Entry" (PTE)
"Read address of page" (PAGE)
Depending on the size of the address space, the page tables can start
at different levels. For example:
crash> vtop 3ff8000c000
VIRTUAL PHYSICAL
3ff8000c000 2e3832000
PAGE DIRECTORY: 0000000000aaa000
RTTE: 0000000000aadff8 => 00000002e3c00007
STE: 00000002e3c00000 => 00000002e3df7000
PTE: 00000002e3df7060 => 00000002e383203d
PAGE: 00000002e3832000
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
3d10b8e0c80 2e3832000 0 0 1 7fffc0000000000
(holzheu@linux.vnet.ibm.com)
- Fix the s390dbf time stamps for S390X kernel versions 4.11 and 4.14.
With kernel commit ea417aa8a38bc7db ("s390/debug: make debug event
time stamps relative to the boot TOD clock"), the s390dbf time is
stored relative to the kernel boot time. In order to still show
absolute time since 1970 we have to detect those kernels and re-add
the boot time before printing the records. We can use the
tod_to_timeval() symbol to check for those kernels because the
patch has removed the symbol. With kernel commit 6e2ef5e4f6cc5734
("s390/time: add support for the TOD clock epoch extension")
the symbol name for storing the boot time has changed from
"sched_clock_base_cc" to "tod_clock_base". This commit is currently
on the s390 features branch and will be integrated in Linux 4.14.
(holzheu@linux.vnet.ibm.com)
- Further enhancement to the S390X "vtop" command to translate the
binary values of the hardware flags for region, segment and page
table entries. For example:
crash> vtop -u 0x60000000000000
VIRTUAL PHYSICAL
60000000000000 5b50a000
PAGE DIRECTORY: 000000005cea0000
RFTE: 000000005cea0018 => 000000006612400f (flags = 00f)
flags in binary : P=0; TF=00; I=0; TT=11; TL=11
RSTE: 0000000066124000 => 000000005d91800b (flags = 00b)
flags in binary : P=0; TF=00; I=0; TT=10; TL=11
RTTE: 000000005d918000 => 000000006615c007 (flags = 007)
flags in binary : FC=0; P=0; TF=00; I=0; CR=0; TT=01; TL=11
STE: 000000006615c000 => 000000005ce48800 (flags = 800)
flags in binary : FC=0; P=0; I=0; CS=0; TT=00
PTE: 000000005ce48800 => 000000005b50a03f (flags = 03f)
flags in binary : I=0; P=0
PAGE: 000000005b50a000
or for large pages:
crash> vtop -k 0x3d100000000
VIRTUAL PHYSICAL
3d100000000 77c00000
PAGE DIRECTORY: 0000000001210000
RTTE: 0000000001213d10 => 0000000077dc4007 (flags = 007)
flags in binary : FC=0; P=0; TF=00; I=0; CR=0; TT=01; TL=11
STE: 0000000077dc4000 => 0000000077c03403 (flags = 03403)
flags in binary : AV=0, ACC=0011; F=0; FC=1; P=0; I=0; CS=0; TT=00
(zaslonko@linux.vnet.ibm.com)
- PPC64 kernel commit 2f18d533757da3899f4bedab0b2c051b080079dc lowered
the max real address to 53 bits. Without this patch, the warning
message "WARNING: cannot access vmalloc'd module memory" appears
during initialization, and any command that attempts to read a
vmalloc'd kernel virtual address will fail and display "read error"
messages.
(hbathini@linux.vnet.ibm.com)
- Display the KASLR relocation value warning message whenever it is
in use. Without the patch, the message may not get displayed
if the --kaslr option is used, or if the dumpfile is a vmcore
generated by the current snap.so extension module, which now
exports the relocation value in the header.
(anderson@redhat.com)
- Fix to prevent an initialization-time failure when running a live
session on a host system that does not have a "/usr/src" directory.
Without the patch, the session fails with the message "*** Error in
'crash': free(): invalid pointer: <address> ***".
(Lei Chen)
- Fix for the ARM64 "bt" command's display of the user mode exception
frame at the top of the stack in Linux 4.7 and later kernels.
Without the patch, the contents of the user mode exception frame are
invalid due to the miscalculation of the starting address of the
pt_regs structure on the kernel stack.
(anderson@redhat.com)
- Integrated support for usage of the Linux 4.14 ORC unwinder by the
x86_64 "bt" command. Kernels configured with CONFIG_ORC_UNWINDER
contain .orc_unwind and .orc_unwind_ip sections that can be queried
to determine the stack frame size of any text address within a kernel
function. For kernels not configured with CONFIG_FRAME_POINTER,
the crash utility does frame size calculation by disassembling a
function from its beginning to the specified text address, counting
the push, pop, and add/sub rsp instructions, accounting for retq
instructions that occur in the middle of a function. With this patch,
access to the new ORC sections has been plugged into the existing
frame size calculator, resulting in a more efficient and accurate
manner of determining frame sizes, and as a result, more accurate
backtraces.
(anderson@redhat.com)
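The disassembly-based fallback described above amounts to keeping a
running total of stack adjustments up to the target address. A heavily
simplified Python sketch over a hypothetical pre-decoded instruction
list; real disassembly, the handling of mid-function retq
instructions, and the ORC lookup itself are all omitted:

```python
def frame_size(instructions):
    """Sum the stack growth of (operation, operand) pairs, in bytes."""
    size = 0
    for op, operand in instructions:
        if op == "push":
            size += 8
        elif op == "pop":
            size -= 8
        elif op == "sub_rsp":
            size += operand
        elif op == "add_rsp":
            size -= operand
    return size

# Made-up prologue: two pushes followed by "sub $0x20,%rsp".
prologue = [("push", None), ("push", None), ("sub_rsp", 0x20)]
print(frame_size(prologue))
```

An ORC entry supplies the equivalent stack offset directly for a text
address, which is why the lookup is cheaper than walking the
instructions.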
- Fix for the ARM64 "bt" command when run against Linux 4.14-rc1.
Without the patch, a message indicating "crash: builtin stackframe.sp
offset incorrect!" is issued during session initialization, and the
"bt" command fails with the error message "bt: invalid structure
member offset: task_struct_thread_context_sp".
(anderson@redhat.com)
- Fix for the "task -R <member>" option on Linux 4.13 and later kernels
where the task_struct contains a "randomized_struct_fields_start" to
"randomized_struct_fields_end" section. Without the patch, a member
argument that is inside the randomized section is not found.
(anderson@redhat.com)
- Fix for the "snap.so" extension module to pass the value of the ARM64
"kimage_voffset" value in the ELF header. Without the patch, it is
necessary to use the "--machdep kimage_voffset=<value>" command line
option, or the session fails with the message "crash: vmlinux and
vmcore do not match!".
(anderson@redhat.com)
(09/29/17)
7.1.9-1.fc27
Information for build crash-7.1.9-1.fc27
https://koji.fedoraproject.org/koji/buildinfo?buildID=882902
7.1.9-1.fc26
Information for build crash-7.1.9-1.fc26
https://koji.fedoraproject.org/koji/buildinfo?buildID=883090
7.1.9-1.fc25
Information for build crash-7.1.9-1.fc25
https://koji.fedoraproject.org/koji/buildinfo?buildID=883098
7.1.9 - Fixes to address three gcc-7.0.1 compiler warnings that are generated
when building with "make warn". The warning types are "[-Wnonnull]"
in filesys.c, and "[-Wformat-overflow=]" in kernel.c and cmdline.c.
(anderson@redhat.com)
- Fix for the PPC64 "mach -o" option to update the OPAL console buffer
size from 256K to 1MB, based upon the latest skiboot firmware source.
(ankit@linux.vnet.ibm.com)
- Fix for the "mod -[sS]" option to prevent the erroneous reassignment
of one or more symbol values of a kernel module. Without the patch,
when loading a kernel module, a message indicating "mod: <module>:
last symbol: <symbol> is not _MODULE_END_<module>?" may be displayed,
and one or more symbols may be reassigned an incorrect symbol value.
If none of the erroneous symbol value reassignments are beyond the
end of the module's address space, then there will be no message.
(anderson@redhat.com)
- Linux 4.10 commit 401721ecd1dcb0a428aa5d6832ee05ffbdbffbbe finally
exports the x86_64 "phys_base" value in the VMCOREINFO note, so
utilize it whenever it exists.
(anderson@redhat.com)
- Implemented a new "log -a" option that dumps the audit logs remaining
in kernel audit buffers that have not been copied out to the
user-space audit daemon.
(d.hatayama@jp.fujitsu.com)
- Fix for the "kmem <address>" option and the "search" command
in x86_64 kernels that contain, or have backports of, kernel commit
7c1da8d0d046174a4188b5729d7579abf3d29427, titled "crypto: sha - SHA1
transform x86_64 AVX2", which introduced an "_end" text symbol.
Without the patch, if a base kernel symbol address that is larger
than the "_end" text symbol is passed to "kmem <address>", its
symbol/filename information will not be displayed. Also, when the
"search" command scans the __START_KERNEL_map region that contains
kernel text and static data, the search will be truncated to stop at
the "_end" text symbol address.
(anderson@redhat.com)
- Enhancement for the determination of the ARM64 "kimage_voffset" value
in Linux 4.6 and later kernels if an ELF format dumpfile does not
contain its value in a VMCOREINFO note, or when running against
live systems using /dev/mem, /proc/kcore, or an older version of
/dev/crash.
(liyueyi@live.com)
- Optimization of the "kmem -f <address>" and "kmem <address>" options
to significantly reduce the amount of time to complete the buddy
allocator free-list scan for the target address. On very large
memory systems, the patch may reduce the time spent by several orders
of magnitude.
(anderson@redhat.com)
- Fix for a compilation error if glibc-2.25 or later has been installed
on the host build machine. Without the patch, the build fails with
the error message "amd64-linux-nat.c:496:1: error: conflicting types
for 'ps_get_thread_area'".
(anderson@redhat.com)
- Fix for the "list -[hH]" options if a list_head.next pointer is
encountered that contains an invalid NULL pointer. Without the
patch, the "list -[hH]" options would complete/continue as if the
NULL were a legitimate end-of-list indicator, and no error would be
reported.
(rabin.vincent@axis.com)
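The distinction being fixed is between a list that ends by returning
to its head and one that ends in a NULL next pointer. A minimal Python
sketch, assuming a hypothetical dictionary that maps each list_head
address to its next pointer; it does not guard against other kinds of
corruption such as loops that never return to the head:

```python
def walk_list(next_of, head):
    """Follow next pointers from head until the list wraps back to it."""
    entries = []
    node = next_of[head]
    while node != head:
        if node == 0:
            last = entries[-1] if entries else head
            raise ValueError("invalid NULL list_head.next at %#x" % last)
        entries.append(node)
        node = next_of[node]
    return entries

# Made-up addresses: a well-formed circular list.
print([hex(n) for n in walk_list({0x10: 0x20, 0x20: 0x30, 0x30: 0x10}, 0x10)])
```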
- Provide basic Huge Page usage as part of "kmem -i" output, showing
the total amount of memory allocated for huge pages, and the amount
of the total that is free.
(atomlin@redhat.com)
- Fix for the determination of the x86_64 "phys_base" value when it is
not passed in the VMCOREINFO data of ELF vmcores. Without the patch,
it is possible that the base address of the vmalloc region is unknown
and initialized to an incorrect default address during the very early
stages of initialization, which causes the parsing of the PT_LOAD
segments for the START_KERNEL_map region to fail.
(anderson@redhat.com)
- Fix for the "dis" command to detect duplicate symbols in the case
of a "symbol+offset" argument where the duplicates are contiguous
in the symbol list. In addition, reject "symbol+offset" arguments
if the resultant address goes beyond the end of the function.
(anderson@redhat.com)
- Fix for the "set scope" option if the kernel was configured with
CONFIG_RANDOMIZE_BASE. Without the patch, the command fails with
the message "set: gdb cannot find text block for address: <symbol>".
This also affects extension modules that call gdb_set_crash_scope()
when running with KASLR kernels.
(anderson@redhat.com)
- Fix for the extensions/trace.c extension module to account for
Linux 4.7 kernel commit 9b94a8fba501f38368aef6ac1b30e7335252a220,
which changed the ring_buffer_per_cpu.nr_pages member from an int
to a long. Without the patch, the trace.so extension module fails
to load on big-endian machines, indicating "extend: Num of pages
is less than 0".
(feij.fnst@cn.fujitsu.com)
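Why only big-endian machines are affected can be shown with plain byte
handling: reading the first four bytes of an 8-byte long returns the
low half on a little-endian machine but the high half on a big-endian
one. A minimal Python sketch with a made-up page count; the actual
check in trace.c is not reproduced:

```python
def read_as_int(raw_long_bytes, byteorder):
    """Misread an 8-byte long as a 4-byte signed int at the same address."""
    return int.from_bytes(raw_long_bytes[:4], byteorder, signed=True)

nr_pages = 352                            # made-up value
little = nr_pages.to_bytes(8, "little")
big = nr_pages.to_bytes(8, "big")

print(read_as_int(little, "little"))      # low half: still the real count
print(read_as_int(big, "big"))            # high half: the count is lost
```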
- Fix for the extensions/trace.c extension module when running on
the ppc64 architecture. Without the patch, the trace.so extension
module fails to load, indicating "extend: invalid text address:
ring_buffer_read". On the ppc64 architecture, the text symbol
is ".ring_buffer_read".
(anderson@redhat.com)
- Fix for the ARM64 "bt" command. Without the patch, the backtrace of
a non-panicking active task generates a segmentation violation when
analyzing Android 4.4-based dumpfiles.
(zhizhouzhang@asrmicro.com)
(04/20/17)
7.1.8-1.fc26
Information for build crash-7.1.8-1.fc26
https://koji.fedoraproject.org/koji/buildinfo?buildID=861562
7.1.8 - Fix for Linux 4.6 commit b03a017bebc403d40aa53a092e79b3020786537d,
which introduced the new slab management type OBJFREELIST_SLAB.
In this mode, the freelist can be an object, and if the slab is full,
there is no freelist. On the next free, an object is recycled to be
used as the freelist but not cleaned-up. This patch will go through
only known freed objects, and will prevent "kmem -S" errors that
indicate "invalid/corrupt freelist entry" on kernels configured
with CONFIG_SLAB.
(thgarnie@google.com)
- Fix for the initialization-time loading of kernel module symbols
if the kernel crashed while running a module's initcall. Without
the patch, the crash session fails during initialization with a message
similar to "crash: store_module_symbols_v2: total: 7 mcnt: 8".
(rabinv@axis.com)
- Fix for a segmentation violation during session initialization when
running against a 32-bit MIPS ELF kdump or compressed kdump if a
per-cpu NT_PRSTATUS note cannot be gathered from the dumpfile
header. Without the patch, a segmentation violation occurs after
the message "WARNING: cannot find NT_PRSTATUS note for cpu: <number>"
is displayed.
(rabinv@axis.com)
- The 32-bit MIPS PGD_ORDER() macro expects __PGD_ORDER to be signed,
which it isn't now since the internal machdep->pagesize is unsigned.
Without this patch, module loading fails during initialization on a
kernel that has a page size of 16KB, with messages that indicate
"please wait... (gathering module symbol data)" followed by
"crash: invalid size request: 0 type: pgd page".
(rabinv@axis.com)
- For ARM64 dumpfiles with VMCOREINFO, verify the new "VA_BITS" number
against the calculated number.
(anderson@redhat.com)
- Fix for the ARM64 "bt" command in Linux 4.10 and later kernels that
are configured with CONFIG_THREAD_INFO_IN_TASK. Without the patch,
the "bt" command will fail for active tasks in dumpfiles that were
generated by the kdump facility.
(takahiro.akashi@linaro.org)
- Fix for Linux 4.10 commit 7fd8329ba502ef76dd91db561c7aed696b2c7720
"taint/module: Clean up global and module taint flags handling".
Without the patch, when running against Linux 4.10-rc1 and later
kernels, the crash utility fails during session initialization with
the message "crash: invalid structure size: tnt".
(panand@redhat.com)
- Fix for support of /proc/kcore as the live memory source in Linux 4.8
and later x86_64 kernels configured with CONFIG_RANDOMIZE_BASE, which
randomizes the unity-mapping PAGE_OFFSET value. Without the patch,
the crash session fails during session initialization with the error
message "crash: seek error: kernel virtual address: <address>
type: page_offset_base".
(anderson@redhat.com)
- Update to the module taint flags handling patch above to account for
the change in size of the module.taints flag from an int to a long,
while allowing for a kernel backport that keeps it as an int.
(anderson@redhat.com)
- Prepare for the kernel's "taint_flag.true" and "taint_flag.false"
member names to be changed to "c_true" and "c_false", which fixes
build problems when an out-of-tree module defines "true" or "false".
(anderson@redhat.com)
- Prevent the livepatch taint flag check during the system banner
display from generating a fatal session-killing error if relevant
kernel symbol names or data structures change in the future (again).
(anderson@redhat.com)
- Fix for the PPC64 "bt" command for non-panicking active tasks in
FADUMP-generated dumpfiles (Firmware Assisted Dump facility).
Without the patch, backtraces of those tasks may be of the form
"#0 [c0000000700b3a90] (null) at c0000000700b3b50 (unreliable)".
This patch uses and displays the ptregs register set saved in the
dumpfile header for the non-panicking active tasks.
(hbathini@linux.vnet.ibm.com)
- Fix for a possible segmentation violation when analyzing Linux 4.6
and earlier x86_64 kernels configured with CONFIG_RANDOMIZE_BASE.
A segmentation violation may occur during session initialization,
just after the patching of the gdb minimal_symbol values message,
depending upon the value of KERNEL_IMAGE_SIZE, which was variable
in the earlier KASLR kernels. This patch sets the KERNEL_IMAGE_SIZE
default value to 1GB for those earlier kernels, and also adds a
new "--machdep kernel_image_size=<value>" option that can be
used to override the default KERNEL_IMAGE_SIZE value if necessary.
(anderson@redhat.com)
- Fix the bracketing of the x86_64 FILL_PML4() macro.
(anderson@redhat.com)
- Fix for the "tree -t radix", "irq", and "files -p" command options
in Linux 4.6 and later kernels due to upstream changes in the radix
tree facility. Without the patch, the commands will fail with the
message "radix trees do not exist or have changed their format".
(hirofumi@mail.parknet.co.jp)
- Fix for the "trace.c" extension module. The kernel buffer referenced
by "max_tr_ring_buffer" is not available in all configurations of the
kernel so the uninitialized max_tr_ring_buffer variable should not be
used. A similar check existed previously before the recent rework of
the trace extension module to support multiple buffers.
(rabinv@axis.com)
- Clarification in the display of CONFIG_SLUB object addresses that are
displayed by the "kmem" command when SLAB_RED_ZONE has been enabled.
By default, CONFIG_SLUB object addresses that are displayed by the
"kmem" command will point to the SLAB_RED_ZONE padding inserted at
the beginning of the object. As an alternative, a new "redzone"
environment variable has been added that can be toggled on or off.
If "set redzone off" is entered, the object addresses will point to
the address that gets returned to the allocator.
(hirofumi@mail.parknet.co.jp, anderson@redhat.com)
- Fix for the "CURRENT" value displayed by the "timer -r" command.
Without the patch, if the target machine has been up for a long
enough time, an arithmetic overflow will occur and the time value
displayed will be incorrect.
(shane.seymour@hpe.com)
- Fix for 32-bit X86 kernels configured with CONFIG_RANDOMIZE_BASE.
Without the patch, an invalid kernel PAGE_OFFSET value is calculated
and as a result the session fails during session initialization just
after the patching of the gdb minimal_symbol values message, showing
the warning message "WARNING: cannot read linux_banner string",
followed by "crash: vmlinux and /dev/crash do not match!". This
patch also adds a new "--machdep page_offset=<value>" option that
can be used if the CONFIG_PAGE_OFFSET value is not the default
address of 0xc0000000.
(anderson@redhat.com)
- Introduction of a new PPC64-only "mach -o" option that dumps the OPAL
"Open Power Abstraction Layer" console buffer.
(ankit@linux.vnet.ibm.com)
- Fix for the "bt" command on Linux 4.9 and later 32-bit X86 kernels
containing kernel commit 0100301bfdf56a2a370c7157b5ab0fbf9313e1cd,
subject "sched/x86: Rewrite the switch_to() code". Without the
patch, backtraces for inactive (sleeping) tasks fail with the message
"bt: invalid structure member offset: task_struct_thread_eip".
(anderson@redhat.com)
- Fix for a "[-Wmisleading-indentation]" compiler warning and the
associated bug that is generated by lkcd_x86_trace.c when building
32-bit X86 with "make warn" with gcc-6.3.1.
(anderson@redhat.com)
- Fix for an invalid "bt" warning on a 32-bit X86 idle/swapper task.
Without the patch, the backtrace displays the "cannot resolve stack
trace" warning, dumps the backtrace, and then the text symbols:
crash> bt
PID: 0 TASK: f0962180 CPU: 6 COMMAND: "swapper/6"
bt: cannot resolve stack trace:
#0 [f095ff1c] __schedule at c0b6ef8d
#1 [f095ff58] schedule at c0b6f4a9
#2 [f095ff64] schedule_preempt_disabled at c0b6f728
#3 [f095ff6c] cpu_startup_entry at c04b0310
#4 [f095ff94] start_secondary at c04468c0
bt: text symbols on stack:
[f095ff1c] __schedule at c0b6ef8d
[f095ff58] schedule at c0b6f4ae
[f095ff64] schedule_preempt_disabled at c0b6f72d
[f095ff6c] cpu_startup_entry at c04b0315
[f095ff94] start_secondary at c04468c5
crash>
The backtrace shown is actually correct.
(anderson@redhat.com)
- Another fix for a similar "bt: cannot resolve stack trace" warning
on a 32-bit X86 idle/swapper task, but when running on cpu 0.
(anderson@redhat.com)
- Remove two one-time warning messages that are displayed when running
the "bt" command on Linux 4.2 and later 32-bit X86 kernels. Without
the patch, the first "bt" command that is executed will be preceded
by "bt: WARNING: "system_call" symbol does not exist", followed by
"bt: WARNING: neither "ret_from_sys_call" nor "syscall_badsys"
symbols exist".
(anderson@redhat.com)
- Fix for Linux 3.15 and later 32-bit X86 kernels containing kernel
commit 198d208df4371734ac4728f69cb585c284d20a15, titled "x86: Keep
thread_info on thread stack in x86_32". Without the patch, incorrect
addresses of each per-cpu hardirq_stack and softirq_stack were saved
for usage by the "bt" command.
(hirofumi@mail.parknet.co.jp, anderson@redhat.com)
- Additional fix for Linux 3.15 and later 32-bit X86 kernels containing
kernel commit 198d208df4371734ac4728f69cb585c284d20a15, titled "x86:
Keep thread_info on thread stack in x86_32". The patch fixes the
stack transition symbol from "handle_IRQ" to "handle_irq" for usage
by the "bt" command.
(anderson@redhat.com)
- Fix for 32-bit X86 kernels to determine the active task in a dumpfile
in the situation where the task was running on its soft IRQ stack,
took a hard IRQ, and then the system crashed while it was running on
its hard IRQ stack.
(hirofumi@mail.parknet.co.jp)
- Allow the "--kaslr=<offset>" and/or "--kaslr=auto" command line
options to be used with the 32-bit X86 architecture.
(anderson@redhat.com)
- Removed -Werror from the bfd and opcode library builds.
(anderson@redhat.com)
(02/22/17)
7.1.7-1.fc26
Information for build crash-7.1.7-1.fc26
http://koji.fedoraproject.org/koji/buildinfo?buildID=823232
7.1.7-1.fc25
Information for build crash-7.1.7-1.fc25
http://koji.fedoraproject.org/koji/buildinfo?buildID=823252
7.1.7-1.fc24
Information for build crash-7.1.7-1.fc24
http://koji.fedoraproject.org/koji/buildinfo?buildID=823280
7.1.7 - Set the default 32-bit MIPS HZ value to 100 if the in-kernel config
data is unavailable, and have the "mach" command display the value.
(rabinv@axis.com)
- Enable SPARSEMEM support on 32-bit MIPS by setting SECTION_SIZE_BITS
and MAX_PHYSMEM_BITS.
(rabinv@axis.com)
- Fix for Linux 4.9-rc1 commits 15f4eae70d365bba26854c90b6002aaabb18c8aa
and c65eacbe290b8141554c71b2c94489e73ade8c8d, which have introduced a
new CONFIG_THREAD_INFO_IN_TASK configuration. This configuration
moves each task's thread_info structure from the base of its kernel
stack into its task_struct. Without the patch, the crash session
fails during initialization with the error "crash: invalid structure
member offset: thread_info_cpu".
(anderson@redhat.com)
- Fixes for the gathering of the active task registers from 32-bit MIPS
dumpfiles:
(1) If ELF notes are not available, read them from the kernel's
crash_notes.
(2) If an online CPU did not save its ELF notes, then adjust
the mapping of each ELF note to its CPU accordingly.
(rabinv@axis.com)
- Add support for "help -r" on 32-bit MIPS to display the registers
for each CPU from a dumpfile.
(rabinv@axis.com)
- Fix for Linux 4.9-rc1 commit 0100301bfdf56a2a370c7157b5ab0fbf9313e1cd,
which rewrote the x86_64 switch_to() code by embedding the call to
__switch_to() inside a new __switch_to_asm() assembly code ENTRY()
function. Without the patch, the message "crash: cannot determine
thread return address" gets displayed during initialization, and the
"bt" command shows frame #0 starting at "schedule" instead of
"__schedule".
(anderson@redhat.com)
- When each x86_64 per-cpu cpu_tss.x86_tss.ist[] array member (or in
older kernels, each per-cpu init_tss.x86_hw_tss.ist[] array member),
is compared with its associated per-cpu orig_ist.ist[] array member,
ensure that both exception stack pointers have been initialized
(non-NULL) before printing a WARNING message if they don't match.
(anderson@redhat.com)
- Fix for a possible segmentation violation when analyzing Linux 4.7
x86_64 kernels that are configured with CONFIG_RANDOMIZE_BASE.
Depending upon the randomized starting address of the kernel text
and static data, a segmentation violation may occur during session
initialization, just after the patching of the gdb minimal_symbol
values message.
(anderson@redhat.com)
- Restore the x86_64 "dis" command's symbolic translation of jump
or call target addresses if the kernel was configured with
CONFIG_RANDOMIZE_BASE.
(anderson@redhat.com)
- Fix for the 32-bit MIPS "bt" command to prevent an empty display
(task header only) for an active task if the epc register in its
exception frame contains 00000000.
(rabinv@axis.com)
- Fix for support of Linux 4.7 and later x86_64 ELF kdump vmcores from
kernels configured with CONFIG_RANDOMIZE_BASE. Without the patch,
the crash session may fail during initialization with the message
"crash: vmlinux and vmcore do not match!".
(anderson@redhat.com)
- Fix for the x86_64 "mach" command display of the vmemmap base address
in Linux 4.9 and later kernels configured with CONFIG_RANDOMIZE_BASE.
Without the patch, the command shows a value of ffffea0000000000 next
to "KERNEL VMEMMAP BASE".
(anderson@redhat.com)
- Since the Linux 3.10 release, the kernel has offered the ability to
create multiple independent ftrace buffers. At present, however,
the "trace.c" extension module is only able to extract the primary
buffer. This patch refactors the trace.c extension module so that
the global instance is passed around as a parameter rather than
accessing it directly, and then locates all of the available
instances and extracts the data from each of them.
(kyle.a.tomsic@gmail.com)
- Fix for the s390x "bt" command for active tasks. Since the commit
above in this crash-7.1.7 release that added support for the new
CONFIG_THREAD_INFO_IN_TASK configuration, the backtrace of active
tasks can be incomplete.
(holzheu@linux.vnet.ibm.com)
- In collaboration with an update to the /dev/crash kernel driver, fix
for Linux 4.6 commit a7f8de168ace487fa7b88cb154e413cf40e87fc6, which
allows the ARM64 kernel image to be loaded anywhere in physical
memory. Without the patch, attempting to run live on an ARM64
Linux 4.6 and later kernel may display the warning message "WARNING:
cannot read linux_banner string", and then fails with the message
"crash: vmlinux and /dev/crash do not match!". Version 1.3 of the
crash driver is required, which introduces a new ioctl command that
retrieves the ARM64-only "kimage_voffset" value that is required for
virtual-to-physical address translation.
(anderson@redhat.com)
- Update of the sample memory_driver/crash.c /dev/crash kernel driver
to version 1.3, which adds support for Linux 4.6 and later ARM64
kernels and kernels configured with CONFIG_HARDENED_USERCOPY, and
which, on S390X kernels, uses xlate_dev_mem_ptr() and
unxlate_dev_mem_ptr() instead of kmap() and kunmap().
(anderson@redhat.com)
(11/30/16)
7.1.6-1.fc26
- Fedora Rawhide build: crash-7.1.6-1.fc26 (10/14/16)
http://koji.fedoraproject.org/koji/buildinfo?buildID=810068
7.1.6 - Introduction of support for "live" ramdump files, such as those that
are specified by the QEMU mem-path argument of a memory-backend-file
object. This allows the running of a live crash session against a
QEMU guest from the host machine. In this example, the /tmp/MEM file
on a QEMU host represents the guest's physical memory:
$ qemu-kvm ...other-options... \
-object memory-backend-file,id=MEM,size=128m,mem-path=/tmp/MEM,share=on \
-numa node,memdev=MEM -m 128
and a live session can be run against the guest kernel like so:
$ crash <path-to-guest-vmlinux> live:/tmp/MEM@0
By prepending the ramdump image name with "live:", the crash session will
act as if it were running a normal live session.
(oleg@redhat.com)
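As a rough illustration of the argument form shown above, the following Python sketch splits a ramdump argument into its "live:" flag, file path, and starting physical address. The function name parse_ramdump_arg is hypothetical, this is not crash's actual parser, and treating the value after "@" as hexadecimal is an assumption.

```python
def parse_ramdump_arg(arg):
    """Split 'live:<file>@<address>' or '<file>@<address>'.

    Returns (is_live, path, start_paddr). The address is parsed as
    hexadecimal, which is an assumption made for this sketch.
    """
    is_live = arg.startswith("live:")
    if is_live:
        arg = arg[len("live:"):]
    # Split on the last '@' so that paths containing '@' still work
    path, sep, addr = arg.rpartition("@")
    if not sep or not path:
        raise ValueError("expected <file>@<address>")
    return is_live, path, int(addr, 16)


result = parse_ramdump_arg("live:/tmp/MEM@0")
```

For the example in the entry above, the sketch reports a live session against /tmp/MEM starting at physical address 0.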
- Fix for the support of ELF vmcores created by the KVM "virsh dump
--memory-only" facility if the guest kernel was not configured with
CONFIG_KEXEC, or CONFIG_KEXEC_CORE in Linux 4.3 and later kernels.
Without the patch, the crash session fails during initialization with
the message "crash: cannot resolve kexec_crash_image".
(hirofumi@mail.parknet.co.jp)
- Added support for x86_64 ramdump files. Without the patch, the crash
session fails immediately with the message "ramdump: unsupported
machine type: X86_64".
(anderson@redhat.com)
- Fix for a "[-Werror=misleading-indentation]" compiler warning that
is generated by gdb-7.6/bfd/elf64-s390.c when building S390X in a
Fedora Rawhide environment with gcc-6.0.0.
(anderson@redhat.com)
- Recognize and parse the new QEMU_VM_CONFIGURATION and QEMU_VM_FOOTER
sections used for live migration of KVM guests, which are seen in
the "kvmdump" format generated if "virsh dump" is used without the
"--memory-only" option.
(pagupta@redhat.com)
- Fix for Linux commit edf14cdbf9a0e5ab52698ca66d07a76ade0d5c46, which
has appended a NULL entry as the final member of the pageflag_names[]
array. Without the patch, a message that indicates "crash: failed to
read pageflag_names entry" is displayed during session initialization
in Linux 4.6 kernels.
(andrej.skvortzov@gmail.com)
- Fix for Linux commit 0139aa7b7fa12ceef095d99dc36606a5b10ab83a, which
renamed the page._count member to page._refcount. Without the patch,
certain "kmem" commands fail with the message "kmem: invalid structure
member offset: page_count".
(anderson@redhat.com)
- Fix for an ARM64 crash-7.1.5 "bt" regression for a task that has
called panic(). Without the patch, the backtrace may fail with a
message such as "bt: WARNING: corrupt prstatus? pstate=0x20000000,
but no user frame found" followed by "bt: WARNING: cannot determine
starting stack frame for task <address>". The pstate register
warning will still be displayed (as it is essentially a kdump bug),
but the backtrace will proceed normally.
(anderson@redhat.com)
- Fix for the ARM64 "bt" command in Linux 4.5 and later kernels which
use per-cpu IRQ stacks. Without the patch, if an active non-crashing
task was running in user space when it received the shutdown IPI from
the crashing task, the "--- <IRQ stack> ---" transition marker from
the IRQ stack to the process stack is not displayed, and a message
indicating "bt: WARNING: arm64_unwind_frame: on IRQ stack: oriq_sp:
<address> fp: 0 (?)" gets displayed.
(anderson@redhat.com)
- Fix for the ARM64 "bt" command in Linux 4.5 and later kernels which
are not configured with CONFIG_FUNCTION_GRAPH_TRACER. Without the
patch, backtraces that originate from a per-cpu IRQ stack will dump
an invalid exception frame before transitioning to the process stack.
(anderson@redhat.com)
- Introduction of ARM64 support for 4K pages with 4-level page tables
and 48 VA bits.
(takahiro.akashi@linaro.org)
- Implemented support for the redesigned ARM64 kernel virtual memory
layout and associated KASLR support that was introduced in Linux 4.6.
The kernel text and static data have been moved from unity-mapped
memory into the vmalloc region, and its start address can be
randomized if CONFIG_RANDOMIZE_BASE is configured. Related support
is being put into the kernel's kdump code, the kexec-tools package,
and makedumpfile(8); with that in place, the analysis of Linux 4.6
ARM64 dumpfiles with or without KASLR enabled should work normally
by entering "crash vmlinux vmcore". On live systems, Linux 4.6 ARM64
kernels will only work automatically if CONFIG_RANDOMIZE_BASE is not
configured. Unfortunately, if CONFIG_RANDOMIZE_BASE is configured
on a live system, two --machdep command line arguments are required,
at least for the time being. The arguments are:
--machdep phys_offset=<base physical address>
--machdep kimage_voffset=<kernel kimage_voffset value>
Without the patch, any attempt to analyze a Linux 4.6 ARM64 kernel
fails during initialization with a stream of "read error" messages
followed by "crash: vmlinux and vmcore do not match!".
(takahiro.akashi@linaro.org)
- Linux 3.15 and later kernels configured with CONFIG_RANDOMIZE_BASE
could be identified because of the "randomize_modules" kernel symbol,
and if it existed, the "--kaslr=<offset>" and/or "--kaslr=auto"
options were unnecessary. Since the "randomize_modules" symbol was
removed in Linux 4.1, this patch has replaced the KASLR identifier
with the "module_load_offset" symbol, which was also introduced in
Linux 3.15, but still remains.
(anderson@redhat.com)
- Improvement of the ARM64 "bt -f" display such that in most cases,
each stack frame level delimiter will be set to the stack address
location containing the old FP and old LR pair.
(takahiro.akashi@linaro.org)
- Fix for the introduction of ARM64 support for 64K pages with 3-level
page tables in crash-7.1.5, which fails to translate user space
virtual addresses. Without the patch, "vtop <user-space address>"
fails to translate all user-space addresses, and any command that
needs to either translate or read user-space memory, such as "vm -p",
"ps -a", and "rd -u" will fail.
(anderson@redhat.com)
- Enhancement of the error message generated by the "tree -t radix"
option when a duplicate entry is encountered. Without the patch,
the error message shows the address of the radix_tree_node that
contains the duplicate entry, for example, "tree: duplicate tree
entry: <radix_tree_node>". It has been changed to also display
the radix_tree_node.slots[] array index and the duplicate entry
value, for example, "tree: duplicate tree entry: radix_tree_node:
<radix_tree_node> slots[<index>]: <entry>".
(anderson@redhat.com)
- Introduction of a new "bt -v" option that checks the kernel stack of
all tasks for evidence of stack overflows. It does so by verifying
the thread_info.task address, ensuring the thread_info.cpu value is
a valid cpu number, and checking the end of the stack for the
STACK_END_MAGIC value.
(anderson@redhat.com)
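The three checks listed above can be sketched as follows. This is a minimal Python illustration rather than crash's implementation: the function and parameter names are hypothetical, the caller is assumed to have already read the relevant values from memory, and "verifying the thread_info.task address" is assumed to mean that it points back to the owning task.

```python
# Value the kernel stores at the end of each task's stack
STACK_END_MAGIC = 0x57AC6E9D


def stack_overflow_reasons(task_addr, ti_task, ti_cpu, stack_end_word, nr_cpus):
    """Return a list of reasons why a task's stack looks overwritten.

    task_addr      - address of the task_struct being checked
    ti_task        - value read from thread_info.task
    ti_cpu         - value read from thread_info.cpu
    stack_end_word - word read from the end of the stack
    nr_cpus        - number of cpus in the system
    """
    reasons = []
    if ti_task != task_addr:
        reasons.append("thread_info.task does not point back to the task")
    if not 0 <= ti_cpu < nr_cpus:
        reasons.append("thread_info.cpu is not a valid cpu number")
    if stack_end_word != STACK_END_MAGIC:
        reasons.append("STACK_END_MAGIC is missing")
    return reasons


healthy = stack_overflow_reasons(0xF0962180, 0xF0962180, 6, STACK_END_MAGIC, 8)
```

An empty list means that none of the three checks found evidence of an overflow.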
- Fix to recognize a kernel thread that has user space virtual memory
attached to it. While kernel threads typically do not have an
mm_struct referencing a user-space virtual address space, they can
either temporarily reference one for a user-space copy operation, or
in the case of KVM "vhost" kernel threads, keep a reference to the
user space of the "qemu-kvm" task that created them. Without the
patch, they will be mistaken for user tasks; the "bt" command will
display an invalid kernel-entry exception frame that indicates
"[exception RIP: unknown or invalid address]", the "ps" command
will not enclose the command name with brackets, and the "ps -[uk]"
and "foreach [user|kernel]" options will show the kernel thread as
a user task.
(anderson@redhat.com)
- Fix for the "bt -[eE]" options on ARM64 to recognize kernel exception
frames in VHE enabled systems, in which the kernel runs in EL2.
(takahiro.akashi@linaro.org)
- Fix for the extensions/trace.c extension module to account for the
Linux 4.7 kernel commit dcb0b5575d24 that changed the bit index for
the TRACE_EVENT_FL_TRACEPOINT flag. Without the patch, the "extend"
command fails to load the trace.so module, with the error message
"extend: /path/to/crash/extensions/trace.so: no commands registered:
shared object unloaded". The patch reads the flag's enum value
dynamically instead of using a hard-coded value.
(namhyung@gmail.com)
- Incorporated Takahiro Akashi's alternative backtrace method as a
"bt" option, which can be accessed using "bt -o", and where "bt -O"
will toggle the original and optional methods as the default. The
original backtrace method has adopted two changes/features from
the optional method:
(1) ORIG_X0 and SYSCALLNO registers are not displayed in kernel
exception frames.
(2) stackframe entry text locations are modified to be the PC
address of the branch instruction instead of the subsequent
"return" PC address contained in the stackframe link register.
Accordingly, these are the essential differences between the original
and optional methods:
(1) optional: the backtrace will start with the IPI exception frame
located on the process stack.
(2) original: the starting point of backtraces for the active,
non-crashing, tasks, will continue to have crash_save_cpu()
on the IRQ stack as the starting point.
(3) optional: the exception entry stackframe adjusted to be located
farther down in the IRQ stack.
(4) optional: bt -f does not display IRQ stack memory above the
adjusted exception entry stackframe.
(5) optional: may display "(Next exception frame might be wrong)".
(takahiro.akashi@linaro.org, anderson@redhat.com)
- Fix for the failure of the "sym <symbol>" option in the extremely
unlikely case where the symbol's name string is composed entirely of
hexadecimal characters. For example, without the patch, "sym e820"
fails with the error message "sym: invalid address: e820".
(anderson@redhat.com)
- Fix for the failure of the "dis <symbol>" option in the extremely
unlikely case where the symbol's name string is composed entirely of
hexadecimal characters. For example, without the patch, "dis f"
fails with the error message "dis: WARNING: f: no associated kernel
symbol found" followed by "0xf: Cannot access memory at address 0xf".
(anderson@redhat.com)
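The ambiguity behind the "sym" and "dis" fixes above can be sketched in Python. The ordering shown here, trying a symbol-name match before falling back to a hexadecimal parse, is an assumption made for illustration and is not necessarily how crash resolves the argument; the function name and the symbol table contents are hypothetical.

```python
def resolve_argument(arg, symbols):
    """Return an address for arg, preferring a symbol-name match.

    symbols maps symbol names to addresses. An argument such as "e820"
    or "f" is both a valid hexadecimal number and a possible symbol
    name, so the name lookup is tried first.
    """
    if arg in symbols:
        return symbols[arg]
    try:
        return int(arg, 16)
    except ValueError:
        raise ValueError("invalid address: " + arg)


# Hypothetical symbol table containing an all-hex symbol name
table = {"e820": 0xC1A2B000}
addr = resolve_argument("e820", table)
```

With the symbol present, "e820" resolves to the symbol's address rather than to the number 0xe820; without it, the argument is still accepted as a hexadecimal address.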
- Fix for the X86_64 "bt -R <symbol>" option if the only reference
to the kernel text symbol in a backtrace is contained within the
"[exception RIP: <symbol+offset>]" line of an exception frame
dump. Without the patch, the reference will only be picked up if
the exception RIP's hexadecimal address value is used.
(anderson@redhat.com)
- Fix for the ARM64 "bt -R <symbol>" option if the only reference
to the kernel text symbol in a backtrace is contained within the
"[PC: <address> [<symbol+offset>]" line of an exception frame
dump. Without the patch, the reference will only be picked up if
the PC's hexadecimal address value is used.
(anderson@redhat.com)
- Fix for the gathering of module symbol name strings during session
initialization. In the unlikely case where the ordering of module
symbol name strings does not match the order of the kernel_symbol
structures, a faulty module symbol list entry may be created that
contains a bogus name string.
(sebastien.piechurski@bull.net)
- Fix the PERCENTAGE of total output of the "kmem -i" SWAP USED line
when the system has no swap pages at all. Without the patch, both
the PAGES and TOTAL columns show values of zero, but it confusingly
shows "100% of TOTAL SWAP", which upon first glance may seem to
indicate potential memory pressure.
(jsiddle@redhat.com)
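The zero-swap case above amounts to guarding a percentage calculation against a zero total. A minimal Python sketch follows; the function name is hypothetical, and reporting 0% when there is no swap at all is an assumption about the corrected output.

```python
def swap_used_percent(used_pages, total_pages):
    """Percentage of total swap in use, as a whole number.

    When the system has no swap pages, report 0 instead of dividing
    by zero or showing a misleading 100%.
    """
    if total_pages == 0:
        return 0
    return used_pages * 100 // total_pages


no_swap = swap_used_percent(0, 0)
quarter = swap_used_percent(50, 200)
```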
- Enhancement to determine structure member data if the member is
contained within an anonymous structure or union. Without the patch,
it is necessary to parse the output of a discrete gdb "printf"
command to determine the offset of such a structure member.
(Alexandr_Terekhov@epam.com)
- Speed up session initialization by attempting MEMBER_OFFSET_INIT()
before falling back to ANON_MEMBER_OFFSET_INIT() in several known
cases of structure members that are contained within anonymous
structures.
(anderson@redhat.com)
- Implemented new "list -S" and "tree -S" options that are similar to
each command's -s option, but instead of parsing gdb output, member
values are read directly from memory, so the command is much faster
for 1-, 2-, 4-, and 8-byte members.
(Alexandr_Terekhov@epam.com)
- Fix to recognize and support x86_64 Linux 4.8-rc1 and later kernels
that are configured with CONFIG_RANDOMIZE_MEMORY, which randomizes
the base addresses of the kernel's unity-map address (PAGE_OFFSET),
and the vmalloc region. Without the patch, the crash utility fails
with a segmentation violation during session initialization on a
live system, or will generate a number of WARNING messages followed
by the fatal error message "crash: vmlinux and <dumpfile name> do
not match!" with dumpfiles.
(anderson@redhat.com)
- Fix for Linux 4.1 commit d0a0de21f82bbc1737ea3c831f018d0c2bc6b9c2,
which renamed the x86_64 "init_tss" per-cpu variable to "cpu_tss".
Without the patch, the addresses of the 4 per-cpu exception stacks
cannot be determined, which causes backtraces that originate on
any of the per-cpu DOUBLEFAULT, NMI, DEBUG, or MCE stacks to be
truncated.
(anderson@redhat.com)
- With the introduction of radix MMU in Power ISA 3.0, there are
changes in kernel page table management accommodating it. This patch
series makes appropriate changes here to work for such kernels.
Also, this series fixes a few bugs along the way:
ppc64: fix vtop page translation for 4K pages
ppc64: Use kernel terminology for each level in 4-level page table
ppc64/book3s: address changes in kernel v4.5
ppc64/book3s: address change in page flags for PowerISA v3.0
ppc64: use physical addresses and unfold pud for 64K page size
ppc64/book3s: support big endian Linux page tables
The patches are needed for Linux v4.5 and later kernels on all
ppc64 hardware.
(hbathini@linux.vnet.ibm.com)
- Fix for Linux 4.8-rc1 commit 500462a9de657f86edaa102f8ab6bff7f7e43fc2,
in which Thomas Gleixner redesigned the kernel timer mechanism to
switch to a non-cascading wheel. Without the patch, the "timer"
command fails with the message "timer: zero-size memory allocation!
(called from <address>)"
(anderson@redhat.com)
- Support for PPC64/BOOK3S virtual address translation for radix MMU.
As both radix and hash MMU are supported in a single kernel on
Power ISA 3.0 based server processors, identify the current MMU
type and set page table index values accordingly. Also, in Linux
4.7 and later kernels, PPC64/BOOK3S uses the same masked bit values
in page table entries for 4K and 64K page sizes.
(hbathini@linux.vnet.ibm.com)
- Change the RESIZEBUF() macro so that it will accept buffer pointers
that are not declared as "char *" types. Change two prior direct
callers of resizebuf() to use RESIZEBUF(), and fix two prior users of
RESIZEBUF() to correctly calculate the need to resize their buffers.
(anderson@redhat.com)
- Fix for the "trace.so" extension module to properly recognize Linux
3.15 and later kernels. In crash-7.1.6, the MEMBER_OFFSET() macro
has been improved so that it is able to recognize members of embedded
anonymous structures. However, the module's manner of recognizing
Linux 3.15 and later kernels depended upon MEMBER_OFFSET() failing
to handle anonymous members, and therefore the improvement prevented
the module from successfully loading.
(rabinv@axis.com)
- If a "struct" command address argument is expressed using the per-cpu
"symbol:cpuspec" format, and the symbol is a pointer type, i.e., not
the address of the structure, display a WARNING message.
(atomlin@redhat.com)
- Exclude ARM64 kernel module linker mapping symbols like "$d" and "$x"
as is done with 32-bit ARM. Without the patch, a crash session may
fail during the "gathering module symbol data" stage with a message
similar to "crash: store_module_symbols_v2: total: 15 mcnt: 16".
(takahiro.akashi@linaro.org)
- Enhancement to the ARM64 "dis" command when the kernel has enabled
KASLR. When KASLR is enabled on ARM64, a function call between a
module and the base kernel code will be done via a veneer (PLT) if
the displacement is more than +/-128MB. As a result, disassembled
code will show a branch to the in-module veneer location instead of
the in-kernel target location. To avoid confusion, the output of
the "dis" command will translate the veneer location to the target
location preceded by "plt:", for example, "<plt:printk>".
(takahiro.akashi@linaro.org)
- Improvement of the "dev -d" option to display I/O statistics for disks
whose device driver uses the blk-mq interface. Currently "dev -d"
always displays 0 in all fields for the blk-mq disk because blk-mq
does not increment/decrement request_list.count[2] on I/O creation
and I/O completion. The following values are used in blk-mq in such
situations:
- I/O creation: blk_mq_ctx.rq_dispatched[2]
- I/O completion: blk_mq_ctx.rq_completed[2]
So, we can get the counter of in-progress I/Os as follows:
in progress I/Os == rq_dispatched - rq_completed
This patch displays the result of the above calculation for the disk.
It determines that the device driver uses blk-mq if
request_queue.mq_ops is not NULL. The "DRV" field is displayed as
"N/A(MQ)" if the value for in-flight in the device driver does not
exist for blk-mq.
(m.mizuma@jp.fujitsu.com)
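The in-progress calculation above can be sketched in Python. This is an illustration only: the function name is hypothetical, the counters are passed in as plain lists rather than read from a dumpfile, and summing over several blk_mq_ctx instances is an assumption about how the per-context counters are combined.

```python
def mq_in_progress(ctxs):
    """Compute in-progress I/O counts for a blk-mq disk.

    ctxs is an iterable of (rq_dispatched, rq_completed) pairs, one per
    blk_mq_ctx, where each element is a 2-entry sequence mirroring the
    kernel's rq_dispatched[2] and rq_completed[2] arrays.
    Returns a 2-entry list: dispatched minus completed for each slot.
    """
    totals = [0, 0]
    for dispatched, completed in ctxs:
        for i in range(2):
            totals[i] += dispatched[i] - completed[i]
    return totals


# Two hypothetical contexts
counts = mq_in_progress([([5, 3], [4, 3]), ([2, 7], [2, 5])])
```

For the two hypothetical contexts shown, slot 0 has one request in progress and slot 1 has two.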
(10/13/16)
7.1.5-2.fc25
- Fedora Rawhide build: crash-7.1.5-2.fc25 (05/05/16)
http://koji.fedoraproject.org/koji/buildinfo?buildID=760294
7.1.5-1.fc25
- Fedora Rawhide build: crash-7.1.5-1.fc25 (04/28/16)
http://koji.fedoraproject.org/koji/buildinfo?buildID=758419
7.1.5 - Fix for the handling of Xen DomU ELF dumpfiles to prevent the
pre-gathering of p2m frames during session initialization, which
is unnecessary since ELF files contain the mapping information in
their ".xen_p2m" section. Without the patch, it is possible that the
crash session may be unnecessarily aborted if the p2m frame-gathering
fails, for example, if the CR3 value in the header is invalid.
(ptesarik@suse.com)
- Fix for the translation of X86_64 virtual addresses in the vsyscall
region between 0xffffffffff600000 and 0xffffffffffe00000. Without
the patch, the reading of addresses in that region returns invalid
data; in addition, the "vtop" command for an address in that region
shows an invalid physical address under the "PHYSICAL" column.
(nakajima.akira@nttcom.co.jp, anderson@redhat.com)
- Make the "zero excluded" mode the default behavior when analyzing SADUMP
dumpfiles because some Fujitsu troubleshooting software assumes the
behavior. Also, fix the "set -v" option to show the "zero_excluded"
internal variable as "on" if it has been set when analyzing SADUMP
dumpfiles.
(d.hatayama@jp.fujitsu.com)
- Fix for the "bt" command to properly pull the stack and frame pointer
registers from the NT_PRSTATUS notes of 32-bit tasks running in
user-mode on ARM64. Without the patch, the "bt" command utilizes
ptregs->sp and ptregs->regs[29] for 32-bit tasks instead of the
architecturally-mapped ptregs->regs[13] and ptregs->regs[11], which
yields unpredictable/invalid results, and possibly a segmentation
violation.
(drjones@redhat.com)
- Fix for the "ps -t" option in 3.17 and later kernels that contain
commit ccbf62d8a284cf181ac28c8e8407dd077d90dd4b, which changed the
task_struct.start_time member from a struct timespec to a u64.
Without the patch, the "RUN TIME" value is nonsensical.
(anderson@redhat.com)
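The representation change above can be illustrated with a minimal sketch. It is not the code used by crash; the helper name, the "is_u64" flag and the local struct are hypothetical, and it assumes that the u64 form of start_time holds nanoseconds.

```c
#include <stdint.h>

/* Hypothetical stand-in for the older struct timespec layout. */
struct old_timespec {
    long tv_sec;
    long tv_nsec;
};

/* Hypothetical helper: return a task's start time in whole seconds,
 * whichever representation the kernel uses for task_struct.start_time. */
static uint64_t start_time_seconds(const void *field, int is_u64)
{
    if (is_u64) {
        /* Assumption: the u64 value is a nanosecond count. */
        return *(const uint64_t *)field / 1000000000ULL;
    }
    /* Older layout: use the seconds member of the timespec. */
    return (uint64_t)((const struct old_timespec *)field)->tv_sec;
}
```

Interpreting a u64 as a timespec, or the reverse, is the kind of mismatch that would produce a nonsensical run time.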
- Fix for the changes made to the kernel module structure introduced by
this kernel commit for Linux 4.5 and later kernels:
commit 7523e4dc5057e157212b4741abd6256e03404cf1
module: use a structure to encapsulate layout.
Without the patch, the crash session fails during initialization
with the error message: "crash: invalid structure member offset:
module_core_size".
(sebott@linux.vnet.ibm.com)
- The crash utility has not supported Xen dom0 and domU dumpfiles since
this Linux 3.19 commit:
commit 054954eb051f35e74b75a566a96fe756015352c8
xen: switch to linear virtual mapped sparse p2m list
This patch resurrects support for dom0 dumpfiles only. Without the
patch, the crash session fails during session initialization with the
message "crash: cannot resolve p2m_top".
(daniel.kiper@oracle.com)
- Fix for the replacements made to the kernel's cpu_possible_mask,
cpu_online_mask, cpu_present_mask and cpu_active_mask symbols in
this kernel commit for Linux 4.5 and later kernels:
commit 5aec01b834fd6f8ca49d1aeede665b950d0c148e
kernel/cpu.c: eliminate cpu_*_mask
Without the patch, behavior is architecture-specific, dependent upon
whether the cpu mask values are used to calculate the number of cpus.
For example, ARM64 crash sessions fail during session initialization
with the error message "crash: zero-size memory allocation! (called
from <address>)", whereas X86_64 sessions come up normally, but
invalid cpu mask values of zero are stored internally.
(anderson@redhat.com)
- Fixes for "[-Werror=misleading-indentation]" compiler warnings that
are generated by the following files, when building X86_64 in a
Fedora Rawhide environment with gcc-6.0.0:
gdb-7.6/bfd/coff-i386.c
gdb-7.6/bfd/coff-x86_64.c
kernel.c
x86_64.c
lkcd_common.c
Without the patch, the warnings in the bfd library files are treated
as errors, and abort the build. The three instances in the top-level
crash source code directory are non-fatal. There are several other
gdb-specific instances that are non-fatal and are not addressed.
(anderson@redhat.com)
- Fix for a "[-Werror=shift-negative-value]" compiler warning that is
generated by "gdb-7.6/opcodes/arm-dis.c" when building crash with
"make target=ARM64" on an x86_64 host with gcc-6.0.0. Without the
patch, the warning is treated as an error and the build is aborted.
(anderson@redhat.com)
- Fix for a series of "[-Werror=shift-negative-value]" compiler
warnings that are generated by "gdb-7.6/bfd/elf64-ppc.c" and
"gdb-7.6/opcodes/ppc-opc.c" when building with "make target=PPC64"
on an x86_64 host with gcc-6.0.0. Without the patch, the warnings
are treated as errors and the build is aborted.
(anderson@redhat.com)
- Fix for a "[-Werror=unused-const-variable]" compiler warning that
is generated by "gdb-7.6/opcodes/mips-dis.c" when building with
"make target=MIPS" on an x86_64 host with gcc-6.0.0. Without the
patch, the warning is treated as an error and the build is aborted.
(anderson@redhat.com)
- Configure the embedded gdb module with "--disable-sim" in order to
bypass the unnecessary build of the libsim.a library.
(anderson@redhat.com)
- Implement support for per-cpu IRQ stacks on the ARM64 architecture,
which were introduced in Linux 4.5 by this commit:
commit 132cd887b5c54758d04bf25c52fa48f45e843a30
arm64: Modify stack trace and dump for use with irq_stack
Without the patch, if an active task was operating on its per-cpu
IRQ stack on dumpfiles generated by kdump, its backtrace would start
at the exception frame that was laid down on the process stack.
This patch also adds support for "bt -E" to search IRQ stacks for
exception frames, and the "mach" command displays the addresses
of each per-cpu IRQ stack.
(anderson@redhat.com)
- Fixes for "[-Werror=misleading-indentation]" compiler warnings that
are generated by the following files, when building X86_64 in a
Fedora Rawhide environment with gcc-6.0.0:
gdb-7.6/gdb/ada-lang.c
gdb-7.6/gdb/linux-record.c
gdb-7.6/gdb/inflow.c
gdb-7.6/gdb/printcmd.c
gdb-7.6/gdb/c-typeprint.c
Without the patch, warnings in the gdb-7.6/gdb directory are not
treated as errors, and are non-fatal to the build.
(anderson@redhat.com)
- Further fix for the symbol name changes made to the kernel's
cpu_online_mask, cpu_possible_mask, cpu_present_mask and
cpu_active_mask symbols in Linux 4.5 and later kernels for when
the crash session is brought up with "crash -d<debug-level>".
Without the patch, the cpus found in each mask are displayed like
this example:
cpu_possible_(null): cpus: 0 1 2 3 4 5 6 7
cpu_present_(null): cpus: 0 1
cpu_online_(null): cpus: 0 1
cpu_active_(null): cpus: 0 1
The "(null)" string segments above should read "mask".
(anderson@redhat.com)
- Fix for the changes made to the kernel module structure introduced by
this kernel commit for Linux 4.5 and later kernels:
commit 8244062ef1e54502ef55f54cced659913f244c3e
modules: fix longstanding /proc/kallsyms vs module insertion race.
Without the patch, the crash session fails during initialization
with the error message: "crash: invalid structure member offset:
module_num_symtab".
(anderson@redhat.com)
- Fix for the "dis <function | address>" option if the function or
address is the highest text symbol value in a kernel module. Without
the patch, the disassembly may continue past the end of the function,
or may show nothing at all. The patch utilizes in-kernel kallsyms
symbol size information instead of disassembling until reaching the
address of the next symbol in the module.
(anderson@redhat.com)
- Fix for the "irq -s" option in Linux 4.2 and later kernels. Without
the patch, the irq_chip.name string (e.g. "IO-APIC", "PCI-MSI", etc.)
is missing from the display.
(rabin.vincent@axis.com)
- Improvement of the accuracy of the allocated objects count for each
kmem_cache shown by "kmem -s" in kernels configured with CONFIG_SLUB.
Without the patch, the values under the ALLOCATED column may be too
large because cached per-cpu objects are counted as allocated.
(vinayakm.list@gmail.com)
- Fixes to address two gcc-4.1.2 compiler warnings introduced by the
previous patch:
memory.c: In function ‘count_cpu_partial’:
memory.c:17958: warning: comparison is always false due to limited
range of data type
memory.c: In function ‘count_partial’:
memory.c:18729: warning: comparison is always false due to limited
range of data type
(anderson@redhat.com)
- Introduction of the "whatis -r" and "whatis -m" options. The -r
option searches for data structures of a specified size or within a
range of specified sizes. The -m option searches for data structures
that contain a member of a given type. If a structure contains
another structure, the members of the embedded structure will also
be subject to the search. The type string may be a substring of the
data type name. The output displays the size and name of the data
structure.
(Alexandr_Terekhov@epam.com, anderson@redhat.com)
- Apply a fuzz factor of zero to the re-application of a modified
version of the gdb-7.6.patch in a pre-existing build directory.
Without the patch, it is possible that a previously-applied patch
could be applied a second time without the fuzz restriction.
(anderson@redhat.com)
- Include sys/sysmacros.h explicitly in filesys.c for the definitions
of major(), minor() and makedev(). These functions are defined
in the sys/sysmacros.h header, not sys/types.h. Linux C libraries
are updating to drop the implicit include, so we need to include
it explicitly.
(vapier@gentoo.org)
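As a small illustration of the explicit include, the following sketch round-trips a device number through the three macros. The helper name is hypothetical and the example assumes a Linux C library that provides sys/sysmacros.h.

```c
#include <sys/types.h>
#include <sys/sysmacros.h>   /* major(), minor(), makedev() */

/* Hypothetical helper: build a device number from its major and minor
 * parts, then check that both parts can be extracted again. */
static int dev_roundtrip_ok(unsigned int maj, unsigned int min)
{
    dev_t dev = makedev(maj, min);

    return major(dev) == maj && minor(dev) == min;
}
```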
- Fix for "kmem -[sS]" options for kernels configured with CONFIG_SLUB.
Without the patch, the count displayed in the ALLOCATED column may
be too large, and the "kmem -S" display of allocated/free status of
individual objects may be incorrect.
(hirofumi@mail.parknet.co.jp)
- Fix for "kmem -[sS]" options for kernels configured with CONFIG_SLUB.
Without the patch, if a freelist pointer is corrupt, the address of
the slab page being referenced may not be displayed by the error
message, showing something like: "kmem: kmalloc-32: slab: 0 invalid
freepointer: 6e652f323a302d74".
(hirofumi@mail.parknet.co.jp)
- Fix for the "vm -p" option on kernels that are not configured with
CONFIG_SWAP. Without the patch, the command may fail prematurely
with the message "nr_swapfiles doesn't exist in this kernel".
(rabinv@axis.com)
- Introduction of ARM64 support for 64K pages with 3-level page tables
and 48 VA bits. Until now, support has only existed for 64K pages
with 2-level page tables, and 4K pages with 3-level page tables.
(jim.hull@hpe.com)
- Fix for the "vm -p" and "vtop <user virtual address>" commands if
a user page is swapped out. Without the patch, the "/dev" component
of the swap file pathname may be missing from its display.
(anderson@redhat.com)
- Fix for the x86_64 "vm -p" command to properly emulate the kernel's
pte_present() function, which checks for either _PAGE_PRESENT or
_PAGE_PROTNONE to be set. Without the patch, user pages whose PTE
does not have _PAGE_PRESENT bit set are misconstrued as SWAP pages
with an "(unknown swap location)" along with a bogus OFFSET value.
(anderson@redhat.com)
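The presence test described above can be sketched as follows. The macro and function names are hypothetical, and the bit positions (bit 0 for _PAGE_PRESENT, bit 8 for _PAGE_PROTNONE) are assumptions based on the x86_64 kernel headers rather than values stated in this entry.

```c
#include <stdint.h>

/* Assumed x86_64 PTE bit positions; verify against the target kernel. */
#define SKETCH_PAGE_PRESENT   (1ULL << 0)   /* _PAGE_PRESENT */
#define SKETCH_PAGE_PROTNONE  (1ULL << 8)   /* _PAGE_PROTNONE */

/* A page counts as present if either bit is set, so a PROT_NONE
 * mapping is not mistaken for a swap entry. */
static int pte_is_present(uint64_t pte)
{
    return (pte & (SKETCH_PAGE_PRESENT | SKETCH_PAGE_PROTNONE)) != 0;
}
```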
- When reading a task's task_struct.flags field, check for its size,
which was changed from an unsigned long to an unsigned int.
(dave.kleikamp@oracle.com)
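A size-aware read of such a field might look like the sketch below. The helper is hypothetical; it assumes that the member size (4 or 8 bytes) has already been obtained from the debuginfo data and that the buffer is in host byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a flags field that is either an unsigned
 * int (4 bytes) or an unsigned long (8 bytes), widened to 64 bits. */
static uint64_t read_flags(const void *buf, size_t size)
{
    if (size == sizeof(uint32_t)) {
        uint32_t v32;

        memcpy(&v32, buf, sizeof(v32));
        return v32;
    }

    uint64_t v64;

    memcpy(&v64, buf, sizeof(v64));
    return v64;
}
```

Reading eight bytes where the member is only four would pull in the adjacent member, and on big-endian machines would place the real flags in the wrong half of the value.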
- Introduction of support for the 64-bit SPARC V9 architecture. This
version supports running against a live kernel. Compressed kdump
support is also here, but the crash dump support for the kernel,
kexec-tools, and makedumpfile is still pending. Initial work was
done by Karl Volz with help from Bob Picco.
(dave.kleikamp@oracle.com)
- Account for the Linux 3.17 increase of the ARM64 MAX_PHYSMEM_BITS
definition from 40 to 48.
(Johan.Erlandsson@sonymobile.com)
(04/27/16)
7.1.4-1.fc24
- Fedora Rawhide build: crash-7.1.4-1.fc24 (12/17/15)
http://koji.fedoraproject.org/koji/buildinfo?buildID=706240
7.1.4 - Fix for the ARM64 "vtop" command when translating kernel virtual
addresses within a 2MB or 512MB huge page in which the PGD or PMD
contains software-defined PTE bits. Without the patch, the "PAGE:"
address value will show the software-defined bits, the command will
not display the related page structure translation, and will end with
the message "WARNING: sparsemem: invalid section number: <number>".
(Johan.Erlandsson@sonymobile.com, anderson@redhat.com)
- Fix for the X86_64 "bt" command in Linux 4.2 and later kernels
that are configured with both CONFIG_HAVE_COPY_THREAD_TLS and
CONFIG_FRAME_POINTER. Without the patch, the fact that the kernel
was compiled with framepointers is not recognized, which may result
in backtraces containing stale frame references.
(anderson@redhat.com)
- Fix for the "dis" command to support three new x86 instruction
extensions that have been added to the Intel instruction set for
hardware platforms that support them. The newly-added instructions
"clflushopt", "clwb", and "pcommit" prepend 0x66 as a prefix byte to
the "clflush", "xsaveopt" and "sfence" instructions respectively.
Without the patch:
"clflushopt" is disassembled as: "data16" followed by "clflush"
"clwb" is disassembled as: "data16" followed by "xsaveopt"
"pcommit" is disassembled as: "data16" followed by "sfence"
The "clflushopt" instruction was introduced in Linux 3.15 in the
clflushopt() function. The "clwb" and "pcommit" instructions were
introduced in Linux 4.1 in the clwb() and pcommit_sfence() functions.
(anderson@redhat.com)
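The prefix relationship described above can be sketched with a tiny classifier. This is an illustration only, not the disassembler's logic: the function name is hypothetical, the opcode bytes (0x0f 0xae with ModRM reg fields 7 and 6, and 0xf8 for sfence) are assumptions recalled from the Intel instruction set reference, and REX or other prefixes are ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classifier for the 0x0f 0xae opcode group, with and
 * without a leading 0x66 prefix byte. */
static const char *classify_0fae(const uint8_t *code, size_t len)
{
    int prefix66 = 0;

    if (len > 0 && code[0] == 0x66) {
        prefix66 = 1;
        code++;
        len--;
    }
    if (len < 3 || code[0] != 0x0f || code[1] != 0xae)
        return "other";

    unsigned int mod = code[2] >> 6;        /* ModRM mod field */
    unsigned int reg = (code[2] >> 3) & 7;  /* ModRM reg field */

    if (reg == 7 && mod != 3)               /* memory operand, /7 */
        return prefix66 ? "clflushopt" : "clflush";
    if (reg == 6 && mod != 3)               /* memory operand, /6 */
        return prefix66 ? "clwb" : "xsaveopt";
    if (code[2] == 0xf8)                    /* register form */
        return prefix66 ? "pcommit" : "sfence";
    return "other";
}
```

A decoder that does not know the prefixed forms sees the 0x66 byte as a separate "data16" prefix followed by the unprefixed instruction, which matches the pre-patch output listed above.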
- Fix for the extensions/trace.c extension module for Linux 4.2 and
later kernels. Without the patch, the module fails to load, with
the message "failed to init the offset, struct:ftrace_event_call,
member:list".
(anderson@redhat.com)
- For many years, Xen Dom0 dumps could only be saved in ELF format.
Since makedumpfile commit 349a0ed1, it is now possible to save Xen
dumps in compressed kdump format. This patch set adds support for
these files. Two new files, xen_dom0.c and xen_dom0.h, have been
added to provide the common functionality required by both ELF and
compressed kdump formats.
(ptesarik@suse.cz)
- Since Linux v4.1, specifically, "MIPS: Rearrange PTE bits into fixed
positions.", commit be0c37c985eddc46d0d67543898c086f60460e2e, the
MIPS PTE bits are at fixed locations. Since they are macros in the
kernel, this patch adds an explicit kernel version check in order to
determine and set their values.
(rabinv@axis.com)
- Display a machine-type mismatch warning if a little-endian PPC64
compressed kdump created by makedumpfile(8) is used as an argument
with a non-PPC64 crash utility binary. Without the patch, the
dumpfile is accepted, and the session subsequently fails with a
message indicating that the vmlinux and dumpfile do not match.
(anderson@redhat.com)
- Fix for bitmap-handling in SADUMP dumpfiles, which associate each bit
in a bitmap with a physical page in the reverse order that is used
in kdump-compressed format. The bug had not been detected for a long
time because bitmaps in SADUMP formats consist mostly of 0x00 and
0xff excluding a very limited amount of memory space for firmware.
(indou.takao@jp.fujitsu.com, d.hatayama@jp.fujitsu.com)
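The bit-ordering difference can be sketched as two lookup helpers. The function names are hypothetical, and the sketch assumes that the kdump-compressed format numbers the bits of each byte from the least significant bit, so that the reverse order starts from the most significant bit.

```c
#include <stdint.h>

/* Bit for page frame "pfn", counting from the least significant bit
 * of each byte (assumed kdump-compressed order). */
static int bit_lsb_first(const uint8_t *bitmap, uint64_t pfn)
{
    return (bitmap[pfn >> 3] >> (pfn & 7)) & 1;
}

/* Bit for page frame "pfn", counting from the most significant bit
 * of each byte (the reverse order). */
static int bit_msb_first(const uint8_t *bitmap, uint64_t pfn)
{
    return (bitmap[pfn >> 3] >> (7 - (pfn & 7))) & 1;
}
```

For bytes of 0x00 or 0xff both helpers return the same answer for every page, which is consistent with the bug going unnoticed; they only disagree on bytes with a mix of set and clear bits.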
- Fix for the behavior of the --zero_excluded option when used with
SADUMP dumpfiles. Without the patch, the behavior of the
--zero_excluded option is the opposite of what is expected: by
default, reads of filtered pages return successfully with zero-filled
memory, while reads of filtered pages fail when the --zero_excluded
option has been specified.
(d.hatayama@jp.fujitsu.com)
- Fix for the "kmem -i" command in Linux 2.6.27 and later kernels to
prevent the possibility that an arbitrary address may be accessed
when calculating the number of total huge pages. Without the patch,
the command's "COMMIT LIMIT" and "COMMITTED" values may be invalid.
(atomlin@redhat.com)
- Added recognition of the new DUMP_DH_EXCLUDED_VMEMMAP flag in the
header of compressed kdumps, which is set by the new -e option to
the makedumpfile(8) facility. The -e option excludes kernel pages
that contain nothing but kernel page structures for pages that are
not being included in the dump. If the bit is set in the dumpfile,
the crash utility will issue a warning that the dumpfile is known to
be incomplete during initialization, just prior to the system banner
display.
(anderson@redhat.com)
- Fix for the handling of compound pages in Linux 4.4 and later kernels,
which contain this kernel commit:
commit 1d798ca3f16437c71ff63e36597ff07f9c12e4d6
mm: make compound_head() robust
The commit above removes the PG_tail and PG_compound page.flags bits
and the page.first_page member, and introduces a page.compound_head
member, which is a pointer to the head page and whose bit 0 acts as
the tail flag. Without the patch, a SLAB or SLUB warning message
that indicates "cannot determine how compound pages are linked" is
displayed during initialization, and any command that tracks compound
pages will be affected.
(anderson@redhat.com)
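The page.compound_head encoding described above can be decoded as in the following sketch. The helper names are hypothetical and the addresses are plain integers; this illustrates the bit-0 convention and is not the code used by crash.

```c
#include <stdint.h>

/* Bit 0 of page.compound_head acts as the tail flag. */
static int page_is_tail(uint64_t compound_head)
{
    return (int)(compound_head & 1);
}

/* For a tail page, clearing bit 0 yields the address of the head page;
 * any other page is treated as its own head. */
static uint64_t head_page(uint64_t page_addr, uint64_t compound_head)
{
    return page_is_tail(compound_head) ? compound_head - 1 : page_addr;
}
```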
- Fix for the handling of dynamically-sized task_struct structures in
Linux 4.2 and later kernels, which contain these commits:
commit 5aaeb5c01c5b6c0be7b7aadbf3ace9f3a4458c3d
x86/fpu, sched: Introduce CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and
use it on x86
commit 0c8c0f03e3a292e031596484275c14cf39c0ab7a
x86/fpu, sched: Dynamically allocate 'struct fpu'
Without the patch, when running on a filtered kdump dumpfile, it is
possible that error messages like this will be seen when gathering
the tasks running on a system: "crash: page excluded: kernel virtual
address: <task_struct address> type: "fill_task_struct"".
(ats-kumagai@wm.jp.nec.com)
- Fix for the "kmem -s <address>" command in Linux 3.13 and later
kernels configured with CONFIG_SLAB. Without the patch, if the
address argument is contained within an object in a tail page of a
multi-page slab, the command fails with the message "kmem: address
is not allocated in slab subsystem: <address>". Furthermore, in
Linux 4.4 and later kernels configured with CONFIG_SLAB, addresses
that are contained within an object in a tail page of a multi-page
slab will not be marked by their slab cache name by the "rd -S" and
"bt -F" commands.
(anderson@redhat.com)
- Fix for a segmentation violation when attempting to run live on a
system without the crash.ko memory driver, and whose kernel was
configured with CONFIG_STRICT_DEVMEM. Without the patch, if any
-d<value> is entered on the command line, the crash session fails
during initialization.
(dmair@suse.com)
- Update for the determination of the ARM64 page size for kernels
containing this Linux 4.4 commit:
commit 9d372c9fab34cd8803141871195141995f85c7f7
arm64: Add page size to the kernel image header
Without the patch, the kernel page size is calculated by looking
at the size of the "swapper_pg_dir" page directory. With this
update, the page size can be determined by checking a flag built
into the kernel image header, available in the "_kernel_flags_le"
absolute symbol.
(drjones@redhat.com)
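A sketch of decoding the page size from the header flags follows. The function name is hypothetical, and the field layout (bits 1-2 of the flags, with 1 = 4K, 2 = 16K, 3 = 64K and 0 = unspecified) is an assumption taken from the kernel's arm64 booting documentation rather than from this entry.

```c
#include <stdint.h>

/* Hypothetical helper: return the page size in bytes encoded in the
 * image header flags, or 0 if the field is unspecified. */
static unsigned long arm64_page_size_from_flags(uint64_t flags)
{
    switch ((flags >> 1) & 3) {
    case 1:
        return 4096;
    case 2:
        return 16384;
    case 3:
        return 65536;
    default:
        return 0;   /* unspecified: fall back to another method */
    }
}
```

A return value of 0 would leave the caller to fall back on the older method of sizing "swapper_pg_dir".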
- Fix for the handling of ARM and ARM64 QEMU-generated ELF dumpfiles
and compressed kdump clones. The patch utilizes the NT_PRSTATUS
notes in the dumpfile headers instead of reading them from the
kernel's "crash_notes", which are not initialized when QEMU generates
a dumpfile. Without the patch, these warning messages are displayed
during session initialization:
WARNING: invalid note (n_type != NT_PRSTATUS)
WARNING: cannot retrieve registers for active tasks
and running "bt" on an active task causes a segmentation violation.
(drjones@redhat.com)
- Update to the previous QEMU-specific patch to handle kdump dumpfiles
which have offline cpus, and therefore will not contain associated
NT_PRSTATUS notes in the dumpfile header. Without the patch, if
there are any offline cpus, a segmentation violation is generated
during session initialization.
(anderson@redhat.com)
- The s390 stand-alone dump tools may write the kernel memory directly
to a block device. When running the crash utility against such a
block device, a misleading warning message such as this is displayed:
WARNING: /dev/sda1: may be truncated or incomplete
PT_LOAD p_offset: 16384
p_filesz: 5497558138880
bytes required: 5497558155264
dumpfile size: 0
With the patch, the warning message above will be replaced by a note
using this format:
NOTE: /dev/sda1: No dump complete check for block devices
(holzheu@linux.vnet.ibm.com)
- Map CTRL-l to clear the screen while in vi insertion mode. Without
the patch, it displays "^L".
(kwalker@redhat.com)
- Introduced a general-purpose handler to register data structures that
the kernel has dynamically downsized from the size indicated by the
debuginfo data. At this time, only "kmem_cache" and "task_struct"
structures that have been downsized are registered, but others may be
added in the future. If a downsized data structure is passed to gdb
for display, gdb will request a read of the "full" data structure,
which may flow into a memory region that was either filtered by
makedumpfile(8), or perhaps into non-existent memory, thereby killing
the generating command immediately due to a partial read. With this
patch, commands such as "struct" and "task" that reference downsized
data structures will have their reads flagged to return successfully
if a partial read error occurs.
(anderson@redhat.com)
- Fix for Linux 3.18 and later 32-bit ARM kernels that are configured
with CONFIG_SLAB which contain percpu array_cache structures that
were allocated with vmalloc(). Without the patch, during session
initialization there will be error messages that indicate "crash:
kmem_cache: <vaddr>: invalid array_cache pointer: <vaddr>", and
during runtime, the "kmem -[sS]" commands will show kmem_cache lines
that are marked as "[INVALID/CORRUPTED]".
(anderson@redhat.com)
- Added a new "list -l <offset>" option that can only be used in
conjunction with "-s", and requires that the "start" address is the
address of a list_head, or other similar list linkage structure whose
first member points to the next linkage structure. The "-l <offset>"
argument is the offset of the embedded list linkage structure in the
specified "-s" data structure; it can be either a number of bytes or
expressed in "struct.member" format.
(anderson@redhat.com)
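The role of the linkage offset can be sketched in a few lines. The "item" structure and both helpers are hypothetical; the sketch only shows how subtracting the member offset from a linkage address yields the start of the containing structure.

```c
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical structure with an embedded linkage member. */
struct item {
    int id;
    struct list_head link;
};

/* Given the address of an embedded linkage and its byte offset within
 * the containing structure, return the start of that structure. */
static void *entry_from_link(struct list_head *link, size_t offset)
{
    return (char *)link - offset;
}

/* Follow the "next" pointers from a standalone list head and sum the
 * id of each containing structure until the walk returns to the head. */
static int sum_ids(struct list_head *head, size_t offset)
{
    int sum = 0;

    for (struct list_head *p = head->next; p != head; p = p->next)
        sum += ((struct item *)entry_from_link(p, offset))->id;
    return sum;
}
```

Here offsetof(struct item, link) plays the part of the "-l <offset>" argument, which the entry says may be given as a byte count or in "struct.member" form.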
- Enhanced the debug-only display of the first kernel data items read
during session initialization. This includes the system's cpu maps,
xtime, and utsname data. These require at least "-d1" as a command
line option value, and are primarily useful as an aid for debugging
suspect dumpfiles that fail during session initialization.
(anderson@redhat.com)
- Added "print_array" as a new internal variable that may be turned
on/off with the "set" command. When set to "on", gdb's printing of
arrays will be set to "pretty", so that the display of each array
element will consume one line.
(anderson@redhat.com)
- Introduction of the "sys -i" option, which displays the kernel's DMI
identification string data if available.
(atomlin@redhat.com, anderson@redhat.com)
- Fix for "crash --osrelease" on Xen kernels that have both VMCOREINFO
and VMCOREINFO_XEN ELF notes. Without the patch, the command returns
"(unknown)".
(anderson@redhat.com, dietmar.hahn@ts.fujitsu.com)
- Fix for the crash.ko memory driver sample to handle the situation where
legitimate pages of RAM that pass the page_is_ram() and pfn_valid()
verifier functions may not be provided by the s390x hypervisor, and
the subsequent memcpy() to the bounce buffer generates an addressing
exception BUG. The patch utilizes probe_kernel_read() instead of
memcpy() to properly handle the copy failure and prevent the BUG.
(anderson@redhat.com)
(12/16/15)
7.1.3-1.fc24
- Fedora Rawhide build: crash-7.1.3-1.fc24 (09/03/15)
http://koji.fedoraproject.org/koji/buildinfo?buildID=682708
7.1.3 - Fix for the "crash --osrelease" option for flattened format dumpfiles
in the unlikely event that the dumpfile header does not contain the
VMCOREINFO note section from the original ELF /proc/vmcore. Without
the patch, the command displays nothing instead of showing "unknown".
(anderson@redhat.com)
- Fix for the "kmem -s <address>", "bt -F[F]", and "rd -S[S]"
options in kernels configured with CONFIG_SLUB. Without the patch,
if a referenced slab object address comes from a slab cache that
utilizes a multiple-page slab, and the object is located within
a tail page of that slab cache, it will not be recognized as a slab
object. The "bt -F[F]" and "rd -S[S]" options will just show the
object address, and the "kmem -s <address>" option will indicate
"kmem: address is not allocated in slab subsystem: <address>".
This bug is a regression that was introduced in crash-7.1.0 by commit
8b2cb365d7fb139e77cedd80d4061332099ed382, which addressed a bug where
stale slab object addresses were incorrectly being recognized as
valid slab objects.
(anderson@redhat.com)
- Fix for a segmentation violation generated by the ARM64 "bt -[f|F]"
options when analyzing the active tasks in vmcores generated by the
kdump facility. This bug is a regression that was introduced in
crash-7.1.2 by commit 15a58e4070486efa2aa965bdd636626e62b65cc7, which
was an enhancement of the ARM64 backtrace capability for active tasks
in kdump vmcores.
(anderson@redhat.com)
- Fix for the extensions/trace.c extension module to account for
kernels that are not configured with CONFIG_TRACE_MAX_TRACER.
Without the patch, the module fails to load with the error message
"failed to init the offset, struct: trace_array, member: max_offset".
(rabinv@axis.com)
- If a kdump dumpfile is marked as incomplete in its ELF or compressed
kdump header, and the user has not used the --zero_excluded command
line option, append a note to the incomplete dump WARNING message
shown during invocation that suggests the use of --zero_excluded.
(zhouwj-fnst@cn.fujitsu.com)
- Fix for the RSS value displayed by the "ps" command in Linux 2.6.34
and later big-endian machines. Without the patch, a task's RSS value
will be erroneously calculated by using twice its file pages instead
of adding its file pages with its anonymous pages.
(anderson@redhat.com)
- Do not search for a panic task in s390x dumpfiles that are marked as
a "live dump" by the "zgetdump" facility. Without the patch, an
exhaustive, unnecessary search of all kernel stacks that looks for
evidence of a system crash may find an invalid reference in a task's
kernel stack due to the common zero-based user and kernel virtual
address space ranges of the s390x, causing the task to be mistakenly
set as the "PANIC" task.
(holzheu@linux.vnet.ibm.com, anderson@redhat.com)
- Mark the "crash" task that generated a snapshot vmcore utilizing the
"snap.so" extension module as "(ACTIVE)" in the STATE field of
the initial system banner and the "set" command. Without the patch,
the task's STATE field shows it as the "(PANIC)" task.
(anderson@redhat.com)
- Second part of:
Do not search for a panic task in s390x dumpfiles that are marked
as a "live dump" by the "zgetdump" facility...
The first part prevented a search of the active tasks; this part
prevents the last-ditch search of all tasks.
(anderson@redhat.com)
- When searching all kernel stacks for evidence of a panic task in
"live" s390x dumpfiles created by the VMDUMP, stand-alone dump, or
"virsh dump" facilities, none of which explicitly mark the dumpfile
as a "live dump", run a standard "bt" backtrace on each kernel stack
instead of the text-address-only "bt -t". Without the patch, an
invalid text reference may be found in a task's kernel stack due to
the common zero-based user and kernel virtual address space ranges of
the s390x, causing the task to be mistakenly set as the "PANIC" task.
(holzheu@linux.vnet.ibm.com)
- Introduction of the "dis -f <address>" option, which disassembles
from the target address until the end of the function.
(atomlin@redhat.com)
- Fix for the ARM64 "dis" command to prevent branch target addresses
from being displayed as kernel system call alias/wrapper names, for
example, "SyS_read+<offset>" instead of "sys_read+<offset>".
(anderson@redhat.com)
- Fix for the PPC64 "dis" command to prevent conditional branch
target addresses from being displayed as kernel system call
alias/wrapper names, for example, "SyS_read+<offset>" instead
of "sys_read+<offset>".
(anderson@redhat.com)
- Fix for the S390X "dis" command to prevent jump target addresses
from being displayed as kernel system call alias/wrapper names, for
example, "SyS_read+<offset>" instead of "sys_read+<offset>".
(anderson@redhat.com)
- Fix for the "dis" command on architectures with variable-length
instructions. Without the patch, "dis [-f] <function>" may continue
beyond the end of a function, disassembling the memory that is in
between the target function and the next function. For kernel module
functions, the module's debuginfo data must be loaded.
(anderson@redhat.com)
- Minor cleanup and error handling fix-up for the "dis" command.
Without the patch, if the target address of "dis -r" or "dis -f"
is not an exact address of an instruction, "dis -r" will continue
beyond the target address, and "dis -f" will show nothing.
(anderson@redhat.com)
- Reduce the unnecessary error messages if a directory is used as a
command line argument. Without the patch, six error messages are
displayed:
crash: unable to read dump file /tmp
/tmp: ELF header read: Is a directory
/tmp: ELF header read: Is a directory
crash: /tmp: read: Is a directory
read_maps: unable to read header from /tmp, errno = 1
crash: vmw: Failed to read '/tmp': [Error 21] Is a directory
With the patch applied, the functions that generate those messages
are not called; only the standard "not a supported file format",
and "Usage" messages will be displayed.
(anderson@redhat.com)
- If the method of determining how compound pages are linked cannot be
accomplished due to page struct related changes in upstream kernels,
issue a WARNING message during session initialization.
(anderson@redhat.com)
- Fix for the "timer" command on Linux 4.2 and later kernels, which
contain this kernel commit that modifies the tvec_root and tvec
data structures:
commit bc7a34b8b9ebfb0f4b8a35a72a0b134fd6c5ef50
timer: Use hlist for the timer wheel hash buckets
Without the patch, the "timer" command will spew messages indicating
"timer: invalid list entry: 0", followed by "timer: ignoring faulty
timer list at index <number> of timer array".
(anderson@redhat.com)
- Introduction of the "dis -s <address>" option, which displays the
filename and line number that is associated with the specified text
location, followed by a source code listing if it is available on the
host machine. The line associated with the text location will be
marked with an asterisk; depending upon gdb's internal "listsize"
variable, several lines will precede the marked location. If a
"count" argument is entered, it specifies the number of source code
lines to be displayed after the marked location; otherwise the
remaining source code of the containing function will be displayed.
(anderson@redhat.com)
- Added a new "--src <directory>" command line option for use by the
"dis -s" option if the kernel source code is not located in the
standard location that is compiled into the kernel's debuginfo data.
The directory argument should point to the top-level directory of the
kernel source tree.
(anderson@redhat.com)
(09/03/15)
7.1.2-1.fc23
- Fedora Rawhide build: crash-7.1.2-1.fc23 (07/13/15)
http://koji.fedoraproject.org/koji/buildinfo?buildID=668508
7.1.2 - Enhancement of the ARM64 backtrace capability. Without the patch,
backtraces of the active tasks start at the function that is saved
in each per-cpu ELF note. With the patch, the backtrace will start
at the "crash_kexec" function on the panicking cpu, and at the
"crash_save_cpu" function on the other active cpus. By doing so,
the backtrace will display the exception handling functions leading
to crash_kexec() or crash_save_cpu(), as well as the exception frame
register set as it was at the time of the fatal exception on the
panic cpu, or when the shutdown IPI was received on the other cpus.
(anderson@redhat.com)
- Enabled the "bt -R" option on the ARM64 architecture. Without the
patch, the option fails with the message "bt: -R option not supported
or applicable on this architecture or kernel".
(anderson@redhat.com)
- Enabled the "crash --log vmcore" command line option on the ARM64
architecture. Without the patch, the option fails with the message
"crash: crash --log not implemented on ARM64: TBD".
(anderson@redhat.com)
- Fix for the S390X "bt" command when running against kernels that have
Linux 4.0 commit 2f859d0dad818765117c1cecb24b3bc7f4592074, which
removes the "async_stack" and "panic_stack" members from the "pcpu"
structure. Without the patch, backtraces of active tasks that were
executing I/O or machine check interrupts are not displayed, while
other tasks may generate fatal readmem() errors of type "readmem_ul".
(holzheu@linux.vnet.ibm.com)
- Fix to prevent an unnecessary/temporary GETBUF() memory allocation
of 1 MB by the dump_mem_map() utility function when the kernel is
configured with CONFIG_SPARSEMEM.
(yangoliver@gmail.com)
- Speed up the "crash --osrelease" option when used with "flattened"
format dumpfiles. Without the patch, the rearranged data array
initialization is performed before the vmcoreinfo data in the
header is read, which can take a significant amount of time with
large dumpfiles. The patch simply looks for the appropriate
vmcoreinfo data string near the beginning of the dumpfile.
(anderson@redhat.com)
- Fix for the initialization-time sorting mechanism required for
"flattened format" dumpfiles if the dumpfile is truncated/incomplete.
Without the patch, the sorting function continues performing invalid
reads beyond the end of the dumpfile, which may lead to an infinite loop
instead of a session-ending error message. In addition, since the
sorting operation may take several minutes, a "please wait" message
with an incrementing percentage-complete counter will be displayed.
(anderson@redhat.com)
- Several fixes associated with the gathering and display of task
state. Without the patch:
(1) The "ps" command's ST column shows "??" for tasks in the
TASK_WAKING state.
(2) The "ps" command's ST column shows "??" for tasks in the
TASK_PARKED state in Linux 3.14 and later kernels.
(3) The STATE field of the initial system banner and the "set"
command are incorrect if the task state has the TASK_WAKING,
TASK_WAKEKILL modifier, or TASK_PARKED bits set in Linux 3.14
and later kernels.
(4) The "foreach DE" task identifier fails if a task with a PID
number of 0xDE (222) exists.
(5) The "foreach" command's "SW", "PA", "TR" and "DE" task
identifiers inadvertently select all tasks in kernel versions
that do not have those states.
(6) The "help -t" output would display incorrect values for the
TASK_WAKEKILL, TASK_WAKING and TASK_PARKED states in Linux 3.14
and later kernels.
Lastly, support for the TASK_NOLOAD modifier introduced in Linux 4.2
has been added to the STATE field of the "set" command and the initial
system banner.
(anderson@redhat.com)
- Fix for the internal memory allocation functionality. Without the
patch, in the unlikely event where the GETBUF() facility has to
utilize malloc() to allocate a buffer, and CTRL-c is entered while
that buffer is being zeroed out before being returned to the caller,
it may result in a never-ending set of "malloc-free mismatch" error
messages.
(anderson@redhat.com)
- Fix for the PPC64 "bt" command for active non-panic tasks. Without
the patch, the backtrace may fail immediately with the error message
"bt: invalid kernel virtual address: f type: Regs NIP value".
(anderson@redhat.com)
- Fix for the "bt" command on little-endian PPC64 machines for tasks
that are blocked in __schedule(). Without the patch, there will be
two "__switch_to" frames displayed before the normal "__schedule"
frame that is used as the starting point for blocked tasks.
(anderson@redhat.com)
- Fix for the PPC64 "bt" command to align its exception frame verifier
function with the most recent version of the kernel's getvecname()
function, which was updated in Linux 3.12. Without the patch, the
"Hypervisor Decrementer", "Emulation Assist", "Hypervisor Doorbell",
"Altivec Unavailable", "Instruction Breakpoint", "Denormalisation",
"HMI" and "Altivec Assist" exception types are not recognized and
their exception frames are not displayed; the "Doorbell" exception
type is marked as a "reserved" exception type.
(anderson@redhat.com)
- Fix for the "timer" command when run on a kernel with a large number
of cpus. Without the patch, the command may fail prematurely with
a dump of the internal crash utility allocated buffer statistics
followed by the message "timer: cannot allocate any more memory!".
(anderson@redhat.com)
- Commit f95ecdc330a11d3701de859aab59a5ab5954aae6, which speeds up
"crash --osrelease" for flattened format dumpfiles, inadvertently
broke the option for ELF kdump and compressed kdump dumpfiles.
(anderson@redhat.com)
- Implementation of two new "files" command options. The "files -c"
option is context-sensitive, similar to the regular "files"
command when used without an argument, but replaces the FILE and
DENTRY columns with I_MAPPING and NRPAGES columns that reflect
each open file's inode.i_mapping address_space structure address,
and the address_space.nrpages count within it; this shows how
many of each open file's pages are currently in the system's
page cache. The "files -p <inode>" option takes the address
of an inode, and dumps all of its pages that are currently in the
system's page cache, borrowing the "kmem -p" page structure output.
(yangoliver@gmail.com)
- Modified the qualification for the execution of the "runq -g" option.
Without the patch, if the target kernel was not configured with both
CONFIG_FAIR_GROUP_SCHED and CONFIG_RT_GROUP_SCHED, the command fails
with the message "runq: -g option not supported or applicable on this
architecture or kernel". With this patch, if the kernel was built
with either CONFIG_FAIR_GROUP_SCHED or CONFIG_RT_GROUP_SCHED, the
command will execute.
(rabinv@axis.com)
- Fix for the error handling of the "foreach task -R struct.member"
format if an invalid structure and/or member is used as an argument.
Without the patch, the command will display the expected error
indicating "task: invalid structure member reference", but then will
be followed by a stream of "task: recursive temporary file usage"
error messages.
(anderson@redhat.com)
- Force the 32-bit MIPS extensions/eppic.so to be compiled with -m32.
This is required when "make extensions" is executed after the top
level crash binary has been built with "make TARGET=MIPS" on an
x86_64 host.
(rabinv@axis.com)
- If the starting hexadecimal address of a function is passed to the
"dis" command without a count argument, disassemble the entire
function -- similar to when a symbol name of a function is passed
without a count argument. Without the patch, only one instruction
is displayed.
(atomlin@redhat.com)
- Fix compiler warning generated by extensions/trace.c when compiled
with gcc version 5. Without the patch, the message "warning: the
use of 'mktemp' is dangerous, better use 'mkstemp'" is generated.
(anderson@redhat.com)
- Update the extensions/eppic.mk file to clone the eppic source code
from https://github.com/lucchouina/eppic.git.
(lucchouina@gmail.com)
- Export the previously static symbol_name_count() function, which
returns a count of symbols with the same name. Export a new
is_symbol_text() function, which checks whether a specified symbol
entry is of type 't' or 'T'.
(atomlin@redhat.com, anderson@redhat.com)
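A minimal sketch of the two checks described above, for illustration
only. The sym_entry structure and both function names are hypothetical
stand-ins, not the crash utility's own definitions:

```c
#include <string.h>

/* Hypothetical, simplified symbol entry; not crash's real symbol type. */
struct sym_entry {
	unsigned long value;	/* symbol address */
	char type;		/* nm-style type character */
	const char *name;
};

/* Count the entries in a table whose name matches exactly. */
static int count_symbols_named(const struct sym_entry *tbl, int n,
			       const char *name)
{
	int i, count = 0;

	for (i = 0; i < n; i++)
		if (strcmp(tbl[i].name, name) == 0)
			count++;
	return count;
}

/* Return 1 if the entry is a text symbol of type 't' or 'T'. */
static int entry_is_text(const struct sym_entry *sp)
{
	return sp->type == 't' || sp->type == 'T';
}
```

A caller that gets a count greater than 1 for a name could then apply
the text check to each match to decide whether the name is ambiguous.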
- If a symbol or symbol+offset argument is passed to the "dis" command,
and there are multiple text symbols with the same symbol name, then
display a message indicating that there are "duplicate text symbols
found", followed by a list of the symbols. Without the patch, the
duplicate symbol with the lowest virtual address is used.
(atomlin@redhat.com, anderson@redhat.com)
(07/13/15)
7.1.1-1.fc23
- Fedora Rawhide build: crash-7.1.1-1.fc23 (05/28/15)
http://koji.fedoraproject.org/koji/buildinfo?buildID=639705
7.1.1 - Fix for two minor issues with the "net" command. Without the patch,
the "net -a" option appends its correct output with the command's
"Usage:" message; and if either the "net -x" or "net -d" options are
used without also specifying "-s" or "-S", the error message would
indicate "net: illegal flag: 800000" or "net: illegal flag: 1000000"
instead of showing the command's "Usage:" message.
(anderson@redhat.com)
- If the kernel (live or dumpfile) has the TAINT_LIVEPATCH bit set, or
if the Red Hat "kpatch" module is installed, the tag "[LIVEPATCH]"
will be displayed next to the kernel name in the initial system
banner and by the "sys" command. This new tag replaces the
"[KPATCH]" tag that was introduced in crash-7.0.7.
(anderson@redhat.com)
- Addressed three Coverity Scan complaints in vmware_vmss.c:
50:leaked_storage: Variable "fp" going out of scope leaks the
storage it points to.
53:leaked_storage: Variable "fp" going out of scope leaks the
storage it points to.
256:warning: Use of memory after it is freed
(anderson@redhat.com)
- Remove the LKCD-only "propeller spinner" seen when a dumpfile read
requires more than 2048 page header accesses. This was put in place
because of the non-random-access design of LKCD dumpfiles. Without
the patch, the spinner display is intermingled with command output,
which complicates the parsing of the output.
(watters.sam@gmail.com)
- Fix to support the Linux version increment from 3 to 4. Without the
patch, both dumpfile and live sessions fail during initialization,
issuing the message "WARNING: kernel version inconsistency between
vmlinux and dumpfile" or "WARNING: kernel version inconsistency
between vmlinux and live memory", followed by the nonsensical fatal
error message "crash: incompatible arguments: vmlinux is not SMP --
vmcore is SMP" or "crash: incompatible arguments: vmlinux is not
SMP -- live system is SMP". To prevent unexpected kernel version
bumps in the future, support has been added for version 5.
(anderson@redhat.com)
- Add support for more than 16TB of physical memory space in the SADUMP
dumpfile format. Without the patch, there is a limitation caused
by several 32-bit members of dump_header structure, in particular
the max_mapnr member, which overflows if the dumpfile contains more
than 16TB of physical memory space. The header_version member of
the dump_header structure has been increased from 0 to 1 in this
extended new format, and the new 64-bit members will be used.
(d.hatayama@jp.fujitsu.com)
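The 16TB figure is what a 32-bit page frame count gives with 4 KB
pages: 2^32 frames * 4096 bytes = 2^44 bytes. The arithmetic can be
sketched as follows, assuming that max_mapnr counts page frames and
that the page size is 4 KB; the function names are hypothetical:

```c
#include <stdint.h>

#define PAGE_SHIFT_4K 12	/* assumption: 4 KB pages */

/* Number of page frames needed to cover 'bytes' of physical space. */
static uint64_t bytes_to_mapnr(uint64_t bytes)
{
	return bytes >> PAGE_SHIFT_4K;
}

/* Return 1 if the page frame count fits in a 32-bit header member. */
static int mapnr_fits_32bit(uint64_t mapnr)
{
	return mapnr <= UINT32_MAX;
}
```

For exactly 16TB the frame count is 2^32, which is one more than a
32-bit member can hold and truncates to 0 when stored in one.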
- Fix for command lines that are redirected to a pipe. Without the
patch, if an external piped-to command contains a quoted string that
includes a "|" character, the command fails with the message "crash:
pipe operation failed".
(anderson@redhat.com)
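One way to tell a "|" that separates commands from one inside a quoted
string is to track the open quote while scanning. This is a sketch of
that general technique, not the code used by the patch, and the
function name is hypothetical:

```c
#include <stddef.h>

/*
 * Return a pointer to the first '|' in 'line' that is not inside a
 * single- or double-quoted string, or NULL if there is none.
 */
static const char *first_unquoted_pipe(const char *line)
{
	char quote = 0;		/* the quote character currently open */
	const char *p;

	for (p = line; *p; p++) {
		if (quote) {
			if (*p == quote)
				quote = 0;	/* closing quote */
		} else if (*p == '"' || *p == '\'') {
			quote = *p;		/* opening quote */
		} else if (*p == '|') {
			return p;
		}
	}
	return NULL;
}
```

The sketch does not handle backslash-escaped quotes.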
- Fix for insecure temporary file usage in _rl_tropen() as reported by
readline library CVE-2014-2524.
(anderson@redhat.com)
- When the gdb-<version>.patch file has changed and a rebuild is
done from within a previously-existing build tree, the "patch -N"
option is used to ignore patches that have been previously applied;
this patch also applies the "patch -r-" option to prevent unnecessary
.rej files from being created.
(anderson@redhat.com)
- Fix to account for Xen hypervisor's "domain" structure member name
change from "is_paused_by_controller" to "controller_pause_count".
Without the patch, in Xen 4.2.5 and later, the crash session fails
during initialization with the error message "crash: invalid
structure member offset: domain_is_paused_by_controller".
(dietmar.hahn@ts.fujitsu.com)
- During initialization, reject useless ARM64 "(A)" and "(a)" absolute
symbols that are below the text region. Without the patch, several
absolute symbols that were recently introduced into the kernel are
displayed by "sym -l" prior to the first kernel virtual address
symbol, and show up in command output where memory values are
translated into kernel symbol references.
(anderson@redhat.com)
- Fix for ARM64 kernels to account for changes in the virtual memory
layout introduced in Linux 3.17. The vmalloc region end address, and
the vmemmap start and end addresses are now calculated at kernel
build time, because they depend upon the size of a struct page.
Accordingly, the crash utility needs to calculate those three address
values dynamically, after the embedded gdb module has initialized.
Without the patch, reads of page structures return invalid data due
to incorrect virtual-to-physical translations of memory in the
vmemmap range. This in turn causes commands that require page
structure contents to fail or show invalid data, such as "kmem -p",
"kmem -[sS]", and the "kmem -[fF]" options.
(anderson@redhat.com)
- Fix to support ELF vmcore dumpfiles whose PT_LOAD file offset values
of their respective memory segments are not laid out sequentially
from low to high in the dumpfile. This has only been seen in ELF
dumpfiles created by VMware's "vmss2core -M" facility. Without the
patch, the crash session may fail during initialization, either with
the message "cannot malloc ELF header buffer", or "crash: <dumpfile>:
not a supported file format".
(anderson@redhat.com)
- Enhancement to the support of VMware .vmss suspended state dumpfiles.
There may be holes in the memory address saved for PCI, etc. In such
cases, the memory dump is divided into regions. With this patch, up
to 3 memory regions are supported.
(hfu@vmware.com)
- Fortified the error handling of task gathering from the pid_hash[]
chains during session initialization. If a chain has been corrupted,
the patch prevents the sequence from entering an infinite loop, and
the error messages associated with corrupt/invalid chains have been
updated to report the pid_hash[] index number.
(anderson@redhat.com)
- Implemented a new STRDUPBUF() utility that will duplicate an existing
string into a buffer allocated with GETBUF(). As is the case with
any buffer allocated with GETBUF(), it is only meant to exist during
the life-span of the current command. If it is not explicitly freed
via FREEBUF(), then it will be freed automatically prior to the next
command.
(anderson@redhat.com)
- Implemented a new fill_struct_member_data() function that gathers
a bundle of data that describes a structure member. The function
receives a pointer to a struct_member_data structure, in which the
caller has initialized the "structure" and "member" name pointers:
    struct struct_member_data {
            char *structure;
            char *member;
            long type;
            long unsigned_type;
            long length;
            long offset;
            long bitpos;
            long bitsize;
    };
A gdb "printm" command is crafted using those two fields, and the
output of the command is used to initialize the remaining six fields.
Adapted from Qiao Nuohan's "pstruct" extension module.
(anderson@redhat.com, qiaonuohan@cn.fujitsu.com)
- Implemented a new "runq -c cpu(s)" option to display the run queue
data of specified cpus. It can be used in conjunction with all runq
command options. The cpus must be specified in a comma- and/or
dash-separated list; for example, "3", "1,8,9", "1-23", or "1,8-15".
(anderson@redhat.com)
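A list in that format can be parsed along these lines. This is an
illustrative sketch only; the function name, the MAX_CPUS limit and the
flag-array representation are assumptions, not the runq implementation:

```c
#include <stdlib.h>
#include <string.h>

#define MAX_CPUS 64	/* assumption for this sketch */

/*
 * Parse a list such as "1,8-15" into a per-cpu flag array.
 * Returns 0 on success, -1 on a malformed list or out-of-range cpu.
 */
static int parse_cpu_list(const char *list, unsigned char selected[MAX_CPUS])
{
	const char *p = list;
	char *end;

	memset(selected, 0, MAX_CPUS);
	if (!*p)
		return -1;			/* empty list */
	while (*p) {
		long lo, hi, c;

		if (*p < '0' || *p > '9')
			return -1;
		lo = hi = strtol(p, &end, 10);
		p = end;
		if (*p == '-') {		/* range "lo-hi" */
			p++;
			if (*p < '0' || *p > '9')
				return -1;
			hi = strtol(p, &end, 10);
			p = end;
		}
		if (lo > hi || hi >= MAX_CPUS)
			return -1;
		for (c = lo; c <= hi; c++)
			selected[c] = 1;
		if (*p == ',') {
			p++;
			if (!*p)
				return -1;	/* trailing comma */
		} else if (*p) {
			return -1;		/* unexpected character */
		}
	}
	return 0;
}
```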
- Build extension modules that utilize the generic extensions/Makefile
with -g. In addition, build the snap.c extension module with -g.
(rabinv@axis.com)
- Several fixes, updates, and enhancements for 32-bit MIPS support:
(1) The MIPS general purpose registers in the elf_gregset_t
don't start at index 0 but at index 6.
(2) Adjust for the kernel's pt_regs structure changes between
kernel versions. For example, fields are inserted into the
middle based on build time options, and the amount of padding
at the head of the structure was changed relatively recently.
To handle this, split the structure definition into two parts
and get the offsets of these two parts dynamically.
(3) Do not display each parsed kernel symbol during initialization
when invoked with "crash -d8".
(4) Add support for loading raw MIPS ramdump dumpfiles.
(5) Add support for compressed kdump dumpfiles.
(rabinv@axis.com)
- Fix for a typo in "help foreach", and a fix for a spelling error in
"help input".
(weijg.fnst@cn.fujitsu.com)
- Fix for "and and" and "the the" typos in the README file.
(weijg.fnst@cn.fujitsu.com)
- Fix to address the Xen 4.5.0 hypervisor symbol name change from
"dom0" to "hardware_domain". Without the patch, the crash session
fails with the error message "crash: cannot resolve: dom0".
(dslutz@verizon.com)
- Fix for a regression in crash-7.1.0 that causes failures when the
"crash -t" option is run on a live system, and when analyzing remote
Linux kernels. Without the patch, "crash -t" on a live system fails
with the message "crash: cannot open remote memory source: /dev/mem",
and attempts to analyze a Linux kernel remotely just shows the kernel
timestamp and exits immediately.
(dslutz@verizon.com, anderson@redhat.com)
- Speed up the session invocation time of "flattened" format dumpfiles
created by the makedumpfile(8) facility. When sorting the blocks of
memory by their intended ELF or compressed kdump file offsets, the
patch replaces the bubble-sort method that is currently used with an
insertion sort method.
(dslutz@verizon.com)
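For reference, an insertion sort keyed on the intended file offset
looks roughly like this. The block_desc structure and its member names
are hypothetical, and the sketch is not taken from the patch:

```c
#include <stddef.h>

/* Hypothetical descriptor for one block of a flattened dumpfile. */
struct block_desc {
	long long file_offset;	/* intended offset in the rearranged file */
	long long buf_offset;	/* where the block's data sits in the flat file */
};

/* Sort descriptors by ascending file_offset with an insertion sort. */
static void sort_blocks(struct block_desc *blk, size_t n)
{
	size_t i, j;

	for (i = 1; i < n; i++) {
		struct block_desc key = blk[i];

		/* Shift larger entries up one slot to make room for key. */
		for (j = i; j > 0 && blk[j - 1].file_offset > key.file_offset; j--)
			blk[j] = blk[j - 1];
		blk[j] = key;
	}
}
```

Insertion sort performs few moves when its input is already mostly in
order, which is the case where it gains the most over a bubble sort.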
- Remove the non-existent "-L" option from the "ps" command's mutually-
exclusive options error message.
(vvs@parallels.com)
- Fix for the "irq", "mount", "kmem -p" and "kmem -v" commands when
they are used in an input file. Without the patch, if more than one
of any of those four commands are used in an input file, the output
of the second and subsequent command instances will not display
their respective command headers.
(anderson@redhat.com)
- Implemented a new "kmem -m" option that is similar to "kmem -p",
but it allows the user to specify the page struct members to be
displayed. The option takes a comma-separated list of one or
more page struct members, which will be displayed following the
page structure address. The "flags" member will always be expressed
in hexadecimal format, and the "_count" and "_mapcount" members will
always be expressed in decimal format. Otherwise, all other members
will be displayed in hexadecimal format unless the current output
radix is 10 and the member is a signed/unsigned integer. Members
that are data structures may be specified by the data structure's
member name, or expanded to specify a member of that data structure.
For example, "-m lru" refers to a list_head data structure, in which
case both the list_head.next and list_head.prev pointer values will
be displayed; if "-m lru.next" is specified, just the list_head.next
value will be displayed.
(atomlin@redhat.com, anderson@redhat.com)
- Support enhancement for the 32-bit MIPS architecture that retrieves
the per-cpu registers from the NT_PRSTATUS notes stored in the header
of compressed kdump dumpfiles.
(rabinv@axis.com)
- Fix to remove an invalid warning message on ARM64 if a crash session
is invoked with the "-d<number>" debug flag. Without the patch,
the invalid message is "WARNING: SPARSEMEM_EX: questionable section
values".
(anderson@redhat.com)
- Remove the leftover ".constructor" build file in the extensions
subdirectory when "make extensions" is complete, and update the
top-level .gitignore file to ignore post-build extensions
subdirectory files.
(anderson@redhat.com)
- Fix for a segmentation violation generated by the "help -[n|D]"
options on ARM64 compressed kdumps.
(anderson@redhat.com)
- Additional output for the "help [-D|-n]" options on ARM64. For ELF
kdump vmcores and compressed kdumps, the elf_prstatus structure in
each NT_PRSTATUS note will be translated.
(anderson@redhat.com)
- The "help -r" option has been extended to dump the ARM64 registers
stored in each per-cpu NT_PRSTATUS note in compressed kdump and
ELF kdump dumpfiles.
(anderson@redhat.com)
- Fix for the ARM64 page size determination on Linux 4.1 and later
kernels. Without the patch, the crash session fails during
initialization with the message "crash: invalid/unsupported page
size: 98304" on kernels with 64K pages. On kernels with 4K pages,
the message is "crash: invalid/unsupported page size: 6144". In
addition, the "-p <page-size>" command line override option
had no effect on ARM64; that has been fixed as well.
(anderson@redhat.com)
- Fix for the DATE display in the initial system banner and by the
"sys" command to account for the Linux 3.17 change that moved
the "timekeeper" symbol and structure into a containing tk_core
structure; the "shadow_timekeeper" timekeeper will be used as an
alternative. Without the patch, the DATE shows something within
a few hours of the Linux epoch, such as "Wed Dec 31 18:00:00 1969".
(kmcmartin@redhat.com)
- Fixes for the translation of ARM64 PTEs, as displayed by the "vm -p"
and "vtop" commands. Without the patch, if "vm -p" references a
swapped-out page on Linux 4.0 and later kernels, the SWAP location
may indicate "(unknown swap location)", and will show an invalid
OFFSET value; on Linux 3.13 and later kernels, running "vtop" on a
user virtual address incorrectly translates the PTE contents of
swapped out pages by showing a PHYSICAL address and FLAGS translation
instead of the SWAP device and OFFSET. It is possible that there may
be PTE bit translation errors on other kernel versions; the patch
addresses the changes in ARM64 PTE bit definitions made in Linux
3.11, 3.13, and 4.0 kernels.
(anderson@redhat.com)
- Enhanced the "struct.member" display capability of the "struct",
"union", "task", "list" and "tree" commands. If a specified
structure member contains an embedded structure, the output may
be restricted to just the embedded structure by expressing the
.member argument as "member.member". If a specified structure
member is an array, the output may be restricted to a single array
element by expressing the .member argument as "member[index]".
Furthermore, these embedded member specifications may extend beyond
one level deep, for example, by expressing the member argument as
"member.member.member", or "member[index].member".
(Alexandr_Terekhov@epam.com, anderson@redhat.com)
- Fix for any command that passes strings to gdb for evaluation,
where the string contains a parentheses-within-parentheses
expression along with a ">" or ">>" operator inside the outermost
set of parentheses. Without the patch, a command such as the
following fails like so:
crash> p ((1+1) >> 1)
p: gdb request failed: p ((1+1)
crash>
(anderson@redhat.com)
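The distinction between a shift operator inside parentheses and an
output redirection can be drawn by tracking parenthesis depth. The
following is a sketch of that idea under a hypothetical function name,
not the logic of the actual fix:

```c
#include <stddef.h>

/*
 * Return a pointer to the first '>' in 'line' that lies outside all
 * parentheses, or NULL if every '>' is inside a parenthesized
 * expression (or there is none).
 */
static const char *first_redirect(const char *line)
{
	int depth = 0;		/* current parenthesis nesting level */
	const char *p;

	for (p = line; *p; p++) {
		if (*p == '(')
			depth++;
		else if (*p == ')' && depth > 0)
			depth--;
		else if (*p == '>' && depth == 0)
			return p;
	}
	return NULL;
}
```

With this check, "p ((1+1) >> 1)" contains no redirection, while
"p (1+1) > out.txt" redirects at the ">" that follows the closing
parenthesis.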
- Fix for the handling of ARM64 kernel module per-cpu symbols. Without
the patch, if the debuginfo data of an ARM64 kernel module that
contains a per-cpu section is loaded by "mod -s <module>" or
"mod -S", commands such as "bt" or "sym" may incorrectly translate
the module's virtual addresses to symbol names.
(Jan.Karlsson@sonymobile.com)
(05/27/15)
7.1.0-3.fc23
- Fedora Rawhide build: crash-7.1.0-3.fc23 (03/02/15)
http://koji.fedoraproject.org/koji/buildinfo?buildID=617219
- Fix to support the Linux version increment from 3 to 4. Without the
patch, both dumpfile and live sessions fail during initialization,
issuing the message "WARNING: kernel version inconsistency between
vmlinux and dumpfile" or "WARNING: kernel version inconsistency
between vmlinux and live memory", followed by the nonsensical fatal
error message "crash: incompatible arguments: vmlinux is not SMP --
vmcore is SMP" or "crash: incompatible arguments: vmlinux is not
SMP -- live system is SMP".
(anderson@redhat.com)
7.1.0-1.fc22
- Fedora Rawhide build: crash-7.1.0-1.fc22 (02/10/15)
http://koji.fedoraproject.org/koji/buildinfo?buildID=610187
7.1.0 - Support for "irq" and "irq -u" on the S390 and S390X architectures
if they are running Linux 3.12 and later kernels. Older kernels
without GENERIC_HARDIRQ support will fail with the error message
"irq: cannot determine number of IRQs".
(sebott@linux.vnet.ibm.com)
- Fix for the handling of multiple ramdump images. Without the patch,
entering more than one ramdump image on the command line may result
in a segmentation violation.
(oza@broadcom.com)
- Implemented the capability of building crash as an x86_64 binary
for analyzing little-endian PPC64 dumpfiles on an x86_64 host, which
can be done by entering "make target=PPC64". After the initial build
is complete, subsequent builds can be done by entering "make" alone.
(anderson@redhat.com)
- Fix for the "crash --log <dumpfile>" option on both of the PPC64
architectures. Without the patch, the command fails with the message
"crash: seek error: physical address: <address> type: log_buf
pointer", followed by "crash: cannot read log_buf value". This bug
was introduced in crash-7.0.0 by a patch that added support for the
PPC64 BOOK3E processor family.
(anderson@redhat.com)
- Fix for a misleading fatal error message if a 32-bit crash binary
built on an X86_64 host with "make target=X86" or "make target=ARM"
is used on a live X86_64 system without specifying a vmlinux
namelist. Without the patch, the session fails with the message
"crash: cannot find booted kernel -- please enter namelist argument".
With the patch, the error message will be "crash: compiled for the X86
architecture" or "crash: compiled for the ARM architecture".
(anderson@redhat.com)
- Fix for finding the starting stack and instruction pointer hooks for
the active tasks in x86_64 ELF or compressed dumpfiles created by the
KVM "virsh dump --memory-only" facility. Without the patch, the
backtraces of active tasks may show an invalid starting frame that
indicates "__schedule". The fix displays the exception RIP and dumps
the register contents that are stored in the dumpfile header. If the
active task was operating in the kernel, the backtrace continues from
there; if the task was operating in user-space, the backtrace is
complete at that point.
(anderson@redhat.com)
- Fix for the "waitq" command when it is passed the address of a
wait_queue_head_t structure. Without the patch, if the entries
on the list are dynamically-created __wait_queue structures on
kernel stacks, the tasks owning the kernel stack are not displayed.
(anderson@redhat.com)
- Implemented a new "net -n [pid|task]" option that displays the list
of network devices with respect to the network namespace of the current
context, or that of a task specified by the optional "pid" or "task"
argument. The former "net -n <address>" option that translates
an IPv4 address expressed as a decimal or hexadecimal value into a
standard numbers-and-dots notation has been changed to "net -N".
(vvs@parallels.com)
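The integer-to-dotted-quad translation can be sketched as below. The
byte order is an assumption: the lowest-order byte is taken as the
first octet, as for an in_addr.s_addr value read on a little-endian
machine. The function name is hypothetical:

```c
#include <stdio.h>

/*
 * Format a 32-bit IPv4 value as dotted-quad text and return the
 * snprintf() result.  Assumption: the lowest-order byte of 'value'
 * is the first octet.
 */
static int ipv4_to_dotted(unsigned long value, char *buf, size_t len)
{
	return snprintf(buf, len, "%lu.%lu.%lu.%lu",
			value & 0xff, (value >> 8) & 0xff,
			(value >> 16) & 0xff, (value >> 24) & 0xff);
}
```

Under that assumption the value 0x0100007f is formatted as 127.0.0.1.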
- Fix for the kernel virtual address to symbol name translation for
special text region delimiter symbols declared in vmlinux.lds.S with
VMLINUX_SYMBOL(), such as __sched_text_start, __lock_text_start,
__kprobes_text_start, __entry_text_start and __irqentry_text_start.
Without the patch, if the addresses of those symbols are the same
value as the first "real" symbol in those text regions, commands
such as "dis" and "sym" may show the "_text_start" symbol name
instead of the desired text symbol name.
(qiaonuohan@cn.fujitsu.com, anderson@redhat.com)
- Enhancement of the "kmem -i" option to display memory overcommit
information, which will be appended to the traditional output of
the command. For example:
crash> kmem -i
                 PAGES        TOTAL      PERCENTAGE
    TOTAL MEM  1965332       7.5 GB         ----
         FREE    78080       305 MB    3% of TOTAL MEM
         USED  1887252       7.2 GB   96% of TOTAL MEM
       SHARED   789954         3 GB   40% of TOTAL MEM
      BUFFERS   110606     432.1 MB    5% of TOTAL MEM
       CACHED  1212645       4.6 GB   61% of TOTAL MEM
         SLAB   146563     572.5 MB    7% of TOTAL MEM
   TOTAL SWAP  1970175       7.5 GB         ----
    SWAP USED        5        20 KB    0% of TOTAL SWAP
    SWAP FREE  1970170       7.5 GB   99% of TOTAL SWAP
 COMMIT LIMIT  2952841      11.3 GB         ----
    COMMITTED  1150595       4.4 GB   38% of TOTAL LIMIT
The COMMIT LIMIT and COMMITTED information is similar to that
displayed by the CommitLimit and Committed_AS lines in /proc/meminfo.
(atomlin@redhat.com)
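The example figures are consistent with the kernel's documented
CommitLimit formula, (total RAM - huge pages) * overcommit_ratio / 100
+ total swap, if one assumes an overcommit ratio of 50 and no huge
pages. Those two values are inferred from the numbers above rather
than stated in the output, and the function names are hypothetical:

```c
/*
 * Commit limit in pages, following the CommitLimit formula:
 * (total RAM - huge pages) * overcommit_ratio / 100 + total swap.
 */
static unsigned long commit_limit_pages(unsigned long total_ram,
					unsigned long hugetlb_pages,
					unsigned long total_swap,
					unsigned long overcommit_ratio)
{
	return (total_ram - hugetlb_pages) * overcommit_ratio / 100
		+ total_swap;
}

/* Whole-number percentage of 'part' relative to 'whole', truncated. */
static unsigned long pct_of(unsigned long part, unsigned long whole)
{
	return whole ? part * 100 / whole : 0;
}
```

With TOTAL MEM of 1965332 pages and TOTAL SWAP of 1970175 pages, the
limit works out to 982666 + 1970175 = 2952841 pages, and 1150595
committed pages is 38% of that limit after truncation.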
- Fix for the "kmem [-s|-S] <address>" command, and the "rd -S[S]"
and "bt -F[F]" options. Without the patch, if the page structure
associated with a memory address still contains a (stale) pointer to
the address of a kmem_cache structure, but whose page.flags does not
have the PG_slab bit set, the address is incorrectly presumed to be
contained within that slab cache. As a result, the "kmem" command
may display one or more messages indicating a "bad inuse counter", a
"bad next pointer" or a "bad s_mem pointer", followed by an "address
not found in cache" error message. The "rd -S[S]" and "bt -F[F]"
commands may mislabel memory locations as belonging to slab caches.
(anderson@redhat.com)
- Added a new "vm -M <mm_struct>" option. When a task is exiting,
the mm_struct address pointer in its task_struct is NULL'd out, and
as a result, the "vm" command looks like this:
crash> vm
PID: 4563   TASK: ffff88049863f500  CPU: 8   COMMAND: "postgres"
       MM               PGD          RSS    TOTAL_VM
        0                 0           0k       0k
However, the mm_struct address can be retrieved from the task's
kernel stack and entered manually with this option, which allows the
"vm" command to attempt to dump the virtual memory data of the task.
It may, or may not, work, depending upon how far the virtual memory
deconstruction has proceeded. This option only verifies that the
address entered is from the "mm_struct" slab cache, and that
its mm_struct.mm_count is non-zero.
(qiaonuohan@cn.fujitsu.com, anderson@redhat.com)
- Fix for the X86_64 "bt" and "mach" commands when running against
kernels that have the following Linux 3.18 commit, which addresses
CVE-2014-9322. The kernel patch removes the per-cpu exception stack
used for handling stack segment faults:
commit 6f442be2fb22be02cafa606f1769fa1e6f894441
x86_64, traps: Stop using IST for #SS
Without this patch, backtraces that originate on any of the other 4
per-cpu exception stacks will be mis-labeled at the transition point
back to the previous stack. For example, backtraces that
originate on the NMI stack will indicate that they are coming from
the "DOUBLEFAULT" stack. The patch examines all idt_table entries
during initialization, looking for gate descriptors that have
non-zero index values, and when found, pulls out the handler
function address; from that information, the exception stack name
string array is properly initialized rather than being hard-coded.
This fix also properly labels the exception stack names on x86_64
CONFIG_PREEMPT_RT realtime kernels, which only utilize 3 exception
stacks instead of the traditional 5 (now 4 with this kernel commit),
instead of just showing "RT". Also, without the patch, the "mach"
command will mis-label the stack names when it displays the base
addresses of each per-cpu exception stack.
(anderson@redhat.com)
- Additional output for the "help [-D|-n]" options on X86 and X86_64
architectures. For compressed kdumps, the elf_prstatus structure in
each per-cpu NT_PRSTATUS note will be translated. For ELF kdumps,
the elf_prstatus structure in each per-cpu NT_PRSTATUS note, and
the QEMUCPUState structure in each per-cpu QEMU note, will be
translated.
(zhouwj-fnst@cn.fujitsu.com, anderson@redhat.com)
- Implemented a new "bt -A" option for the S390X architecture, which
adds support for displaying the new s390x vector registers. For
ELF dumps, the registers are taken from the VX ELF notes; for s390
dumps, the registers are taken from memory. The option produces the
same output as the -a option, but also displays the vector registers
for all active tasks.
(holzheu@linux.vnet.ibm.com)
- Fix for the 32-bit ARM virtual-to-physical address translation of
unity-mapped kernel virtual addresses in kernels configured with
CONFIG_ARM_LPAE if the system's phys_base exceeds 4GB.
(sdu.liu@huawei.com)
- Fix for the "help [-D|-n]" option on 32-bit X86 kernels that use the
64-bit ELF vmcore format generated by "virsh dump --memory-only".
Without the patch, the QEMUCPUState structures in QEMU notes are not
translated.
(qiaonuohan@cn.fujitsu.com)
- Additional output for the "help [-D|-n]" options on X86 and X86_64
architectures. For compressed kdumps generated by "virsh dump
--memory-only", the QEMUCPUState structure in each per-cpu QEMU
note will be translated, and the dumpfile offset address of each
QEMU note will be displayed.
(qiaonuohan@cn.fujitsu.com, anderson@redhat.com)
- Introduction of support for the 32-bit MIPS architecture. This
initial support is restricted to 32-bit MIPS kernels that are
configured as little-endian. With respect to dumpfile types, only
ELF vmcores are recognized. In addition to building crash as a
32-bit MIPS binary, it is also possible to build crash as an x86
binary on an x86 or x86_64 host so that crash analysis of MIPS
dumpfiles can be performed on an x86 or x86_64 host. The x86 binary
can be built by entering "make target=MIPS" for the initial build;
subsequent builds with MIPS support can be accomplished by entering
"make" alone.
(rabin@rab.in)
- Added support for big-endian 32-bit MIPS kernels. Only native MIPS
crash binaries may be built with big-endian support; running the
"make target=MIPS" build option on an x86 or x86_64 host creates
x86 binaries with little-endian support only.
(rabin@rab.in)
- Update the "ps" help page to reflect that the "ps -l" option may be
based upon the task_struct's sched_entity.last_arrival. Without the
patch, it indicates that either the task_struct's last_run or
timestamp value is used.
(anderson@redhat.com)
- Fix for the "kmem -z" option output to change the zone structure's
pages_scanned field from a signed to an unsigned long integer.
(Alexandr_Terekhov@epam.com)
- Fix for "kmem -z" option on Linux 2.6.30 and later kernels. Without
the patch, the zone structure's all_unreclaimable and pages_scanned
fields are not dumped.
(anderson@redhat.com)
- Fix for the PPC64 "bt" command on both big-endian and little-endian
architectures. Without the patch, backtraces of the active tasks
may be "empty" on little-endian machines, or show a one-liner of
the form: "#0 [c0000005f4db7a60] (null) at 501 (unreliable)" on
big-endian machines.
(anderson@redhat.com)
- Additional output for the "help [-D|-n]" options for the PPC64
architecture. For compressed kdump and ELF kdump dumpfiles, the
elf_prstatus structure in each per-cpu NT_PRSTATUS note will be
translated.
(anderson@redhat.com)
- The "help -r" option has been extended to dump the PPC64 registers
stored in each per-cpu NT_PRSTATUS note in compressed kdump and
ELF kdump dumpfiles.
(anderson@redhat.com)
- Prevent "help -r" and "help -[D|n]" from generating a segmentation
violation when attempting to access non-existent NT_PRSTATUS notes
for offline cpus in ELF or compressed kdumps.
(anderson@redhat.com)
- Fix for the "kmem -V" option output to change the display of the
vm_event_states fields from signed to unsigned long integers.
(adobriyan@gmail.com)
- Fix to allow the "ps -G" qualifier to be used in conjunction with
the "ps -p" option. Without the patch, "ps -G -p" fails with the
error message "ps: do_list: hash queue is in use?"
(anderson@redhat.com)
- Fix for the "runq" command on kernels that are configured with
CONFIG_RT_GROUP_SCHED=n. Without the patch, real-time tasks queued
on a per-cpu rt_rq.rt_prio_array will not be displayed under the
"RT PRIO_ARRAY" header.
(mty.shibata@gmail.com)
- Fix for a regression introduced in crash-7.0.9 when running on a live
32-bit ARM machine. Without the patch, a segmentation violation
is generated during session initialization.
(anderson@redhat.com)
- Enhancement of the "PANIC:" message displayed by the initial system
banner and by the "sys" command. Without the patch, many panic types
are categorized under the same generic message of the form:
PANIC: "Oops: 0000 [#1] SMP " (check log for details)
or in other types of crashes, no message is displayed at all. With
this patch, a more comprehensive search is made of the kernel log for
a more informative panic message.
(drc@yahoo-inc.com, anderson@redhat.com)
- Add appropriate checks for the MIPS architecture to allow extension
modules to be loaded with the "extend" command.
(rabin@rab.in)
- Update the extensions/trace.c extension module to account for the
movement of the ftrace_event_call.name member into an anonymous
union in Linux 3.15, commit de7b2973903c6cc50b31ee5682a69b2219b9919d.
(rabin@rab.in)
- Added support for VMware .vmss suspended state files as dumpfiles.
Similar to all other supported dumpfile types, it is invoked as:
$ crash vmlinux <vmname>.vmss
A "<vmname>.vmss" file created by the VMware vSphere ESX hypervisor
contains a header and the full memory image. A "<vmname>.vmss" file
created by the VMware Workstation facility only contains the header,
and must be accompanied by a companion "<vmname>.vmem" memory image
that is located in the same directory as the "<vmname>.vmss" file.
(hfu@vmware.com)
(02/06/15)
7.0.9-1.fc22
- Fedora Rawhide build: crash-7.0.9-1.fc22 (11/14/14)
http://koji.fedoraproject.org/koji/buildinfo?buildID=593290
7.0.9 - Fix the CPU timer and clock comparator output for the "bt -a" command
on S390X machines. The output of CPU timer and clock comparator has
always been incorrect because:
- We added S390X_WORD_SIZE (8) instead of 4 to get the second word
- We did not left shift the clock comparator by 8
The fix reads the complete 64-bit values and shifts the clock
comparator correctly.
(holzheu@linux.vnet.ibm.com)
- Add "/lib/modules/<version>/build" to the list of directories that
are searched for the currently-running kernel on live systems. This
will automatically locate the vmlinux namelist for kernels that were
locally installed with "make modules_install install".
(lrintel@redhat.com)
- Addressed 3 Coverity Scan issues:
(1) task.c: initialize the "curr" and "curr_my_q" variables in the
dump_tasks_in_task_group_cfs_rq() function.
(2) ramdump.c: make the "rd" and "len" return values from read()
and write() calls in write_elf() to be ssize_t types.
(3) cmdline.c: make the parsed PATH string buffer equal to the size
of the PATH string + 1 to prevent a possible buffer overflow
when a command line starts with a "!".
(anderson@redhat.com)
- Fix for the one-time (dumpfile), or as-required (live system),
gathering of tasks from the kernel pid_hash[] in 2.6.24 and later
kernels. Without the patch, if an entry in a pid_hash[] chain is
not related to the "init_pid_ns" pid_namespace structure, any
remaining entries in the hlist chain are skipped.
(vvs@parallels.com)
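The chain-walking change above can be illustrated with a minimal Python
sketch. The dictionary layout and the names gather_init_ns_pids, "ns" and
"nr" are hypothetical stand-ins for the kernel's hlist entries; this is not
crash's actual code.

```python
def gather_init_ns_pids(chain, init_ns="init_pid_ns"):
    """Collect pid numbers from one pid_hash[] chain (hypothetical layout)."""
    pids = []
    for entry in chain:
        if entry["ns"] != init_ns:
            # Skip only this entry; stopping here would lose the rest
            # of the chain, which is the behavior the fix removes.
            continue
        pids.append(entry["nr"])
    return pids
```

An entry that belongs to another pid namespace is passed over, and later
entries in the same chain are still gathered.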
- Update the "extensions/snap.mk" file to allow the "snap.so" extension
module to be built outside of a crash source tree on a ppc64le PPC64
little-endian host. Without the patch, "make -f snap.mk" would fail
to compile, indicating "gcc: error: macro name missing after '-D'"
(anderson@redhat.com)
- Improve the method for determining whether a 32-bit ARM vmlinux is
an LPAE enabled kernel by first checking whether CONFIG_ARM_LPAE
exists in the vmcoreinfo data, and if it does not, by then checking
whether the next higher symbol above "swapper_pg_dir" is 0x5000 bytes
higher in value.
(sdu.liu@huawei.com)
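The two-step LPAE check described above might look roughly like the
following Python sketch. The function name and the dictionary shapes used
for the vmcoreinfo data and the symbol table are assumptions made for
illustration only.

```python
def is_lpae_kernel(vmcoreinfo, symbols):
    """Guess whether a 32-bit ARM kernel is LPAE-enabled.

    vmcoreinfo: dict of vmcoreinfo keys to values (hypothetical shape).
    symbols:    dict of symbol names to addresses (hypothetical shape).
    """
    # Step 1: an explicit CONFIG_ARM_LPAE entry in the vmcoreinfo data.
    if "CONFIG_ARM_LPAE" in vmcoreinfo:
        return True
    # Step 2: compare swapper_pg_dir with the next higher symbol.
    base = symbols.get("swapper_pg_dir")
    if base is None:
        return False
    higher = [addr for addr in symbols.values() if addr > base]
    return bool(higher) and min(higher) - base == 0x5000
```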
- Fix "defs.h" for building extension modules outside of the crash
utility source tree on PPC and PPC64 machines. Without the patch,
both PPC and PPC64 will get #define'd if the extension module build
procedure does not #define one or the other, which in turn causes
multiple conflicting declarations.
(anderson@redhat.com)
- Fix for the "ps" command performance degradation patch that was
introduced in crash-7.0.8. Without this patch, it is possible that
the "ps" command may fail prematurely with the error message
"ps: bsearch for tgid failed: task: <address> tgid: <number>"
when running on a live system or against a "live" dumpfile.
(panfy.fnst@cn.fujitsu.com)
- Set the 32-bit ARM HZ value to a default value of 100 if the kernel
was not configured with CONFIG_IKCONFIG. Without the patch, the
initial system banner and the "sys" command show "UPTIME: (cannot
calculate: unknown HZ value)", the "ps -t" option shows "RUN TIME:
(cannot calculate: unknown HZ value)", and the "timer -r" option
kills the crash session with a floating point exception.
(hukeping@huawei.com)
- Fix the error message displayed if the vmlinux or vmcore file is
not the same endian as the crash utility binary. Without the patch
the filename is shown with the incorrect/opposite endian type.
(hukeping@huawei.com)
- Update the "ps" command's "ST" task state display to recognize the
TASK_PARKED state in Linux 3.9 and later kernels. Without the patch,
the command's "ST" column entry for parked tasks shows "??". The
state column will now show "PA", and the foreach command will accept
"PA" as a "state" argument.
(anderson@redhat.com)
- Fortify the protection against the use of an invalid/corrupted
CONFIG_SLAB kmem_cache per-cpu array_cache.limit value during
session initialization. In a recently seen vmcore, several of the
array_cache.limit values were corrupted such that they were stored
as negative values, which in turn caused the "kmem -[sS]" options
to fail immediately with a dump of the internal memory buffer
allocation statistics and the error message "kmem: cannot allocate
any more memory!".
(anderson@redhat.com)
- Implement a new "offline" internal crash variable that can be set to
either "show" (the default) or "hide". When set to "hide", certain
command output associated with offline cpus will be hidden from view,
and the output will indicate that the cpu is "[OFFLINE]". The new
variable can be set during invocation on the crash command line via
the option "--offline [show|hide]". During runtime, or in a .crashrc
or other crash input file, the variable can be set by entering
"set offline [show|hide]". The commands or options that are affected
when the variable is set to "hide" are as follows:
o On X86_64 machines, the "bt -E" option will not search exception
stacks associated with offline cpus.
o On X86_64 machines, the "mach" command will append "[OFFLINE]"
to the addresses of IRQ and exception stacks associated with
offline cpus.
o On X86_64 machines, the "mach -c" command will not display the
cpuinfo_x86 data structure associated with offline cpus.
o The "help -r" option has been fixed so as to not attempt to
display register sets of offline cpus from ELF kdump vmcores,
compressed kdump vmcores, and ELF kdump clones created by
"virsh dump --memory-only".
o The "bt -c" option will not accept an offline cpu number.
o The "set -c" option will not accept an offline cpu number.
o The "irq -s" option will not display statistics associated with
offline cpus.
o The "timer" command will not display hrtimer data associated
with offline cpus.
o The "timer -r" option will not display hrtimer data associated
with offline cpus.
o The "ptov" command will append "[OFFLINE]" when translating a
per-cpu address offset to a virtual address of an offline cpu.
o The "kmem -o" option will append "[OFFLINE]" to the base per-cpu
virtual address of an offline cpu.
o The "kmem -S" option in CONFIG_SLUB kernels will not display
per-cpu data associated with offline cpus.
o When a per-cpu address reference is passed to the "struct"
command, the data structure will not be displayed for offline
cpus.
o When a per-cpu symbol and cpu reference is passed to the "p"
command, the data will not be displayed for offline cpus.
o When the "ps -[l|m]" option is passed the optional "-C [cpus]"
option, the tasks queued on offline cpus are not shown.
o The "runq" command and the "runq [-t/-m/-g/-d]" options will not
display runqueue data for offline cpus.
o The "ps" command will replace the ">" active task indicator with
a "-" for offline cpus.
The initial system information banner and the "sys" command will
display the total number of cpus as before, but will append the count
of offline cpus. Lastly, a fix has been made for the initialization
time determination of the maximum number of per-cpu objects queued
in a CONFIG_SLAB kmem_cache so as to continue checking all cpus
higher than the first offline cpu. These changes in behavior are not
dependent upon the setting of the crash "offline" variable.
(qiaonuohan@cn.fujitsu.com)
- Adjustment to the "offline" patch-set so that the initial system
banner, the "sys" command, and the X86_64 "mach" command only show
the "OFFLINE" cpu count if there are actually offline cpus.
(anderson@redhat.com)
- Make the "bt -E" option conform to a "-c cpu(s)" specification when
the two options are used together. Without the patch, "bt -E"
ignores a cpu specifier.
(anderson@redhat.com)
- Fix for the determination of the cpu count on 32-bit ARM machines.
Without the patch, if certain patterns of cpus are offline, the count
may be too small, causing cpu-dependent commands to not recognize
online cpus.
(Jan.Karlsson@sonymobile.com, anderson@redhat.com)
- Fix for a missing exception frame dump by the X86_64 "bt" command
when an IRQ is received while a task is running on its per-cpu
interrupt stack with interrupts enabled.
(anderson@redhat.com)
- Fix for the determination of the cpu count on ARM64 machines.
Without the patch, if certain patterns of cpus are offline, the count
may be too small, causing cpu-dependent commands to not recognize
online cpus.
(Jan.Karlsson@sonymobile.com, anderson@redhat.com)
- Fix for a possible SIGSEGV generated during session initialization
while "please wait... (determining panic task)" is being displayed.
This was caused by a patch introduced in crash-7.0.8, and can only
happen when analyzing dumpfiles whose header does not contain the
requisite information to determine the panic task and the active
tasks do not have any crash-related traces in their kernel stacks.
It should be noted that the SIGSEGV can be avoided by entering
"--no_panic" on the crash command line.
(anderson@redhat.com)
- Fix for a SIGSEGV generated by the "bt -a" or "help -r" commands
if the NT_PRSTATUS notes in a compressed kdump are invalid/corrupt.
If all cpus are online but the dumpfile initialization that cycles
through the NT_PRSTATUS notes does not find exactly one note per
cpu, then the register contents in those notes should not be used.
(anderson@redhat.com)
- Fix for data access from "split" compressed kdump dumpfiles. Without
the patch, if a dumpfile read targets physical memory in the first
memory page stored in the second or later sequential split dumpfile,
incorrect data will be returned.
(qiaonuohan@cn.fujitsu.com)
- Correction of the copyright and authorship of ramdump.c.
(oza@broadcom.com)
- Added recognition of the new DUMP_DH_COMPRESSED_INCOMPLETE flag in
the header of compressed kdumps, and the new DUMP_ELF_INCOMPLETE flag
in the header of ELF kdumps. If the makedumpfile(8) facility fails
to complete the creation of compressed or ELF kdump vmcore files
due to ENOSPC or other error, it will mark the vmcore as incomplete.
If either flag is set, the crash utility will issue a warning that
the dumpfile is known to be incomplete during initialization, just
prior to the system banner display. When reads are attempted on
missing data, a read error will be returned. As an alternative,
zero-filled data will be returned if the "--zero_excluded" command
line flag is used, or the "zero_excluded" runtime variable is set
to "on". In either case, the read errors or zero-filled memory
may cause the crash session to fail entirely, cause commands to
fail, or may result in other unpredictable runtime behavior.
(anderson@redhat.com, zhouwj-fnst@cn.fujitsu.com)
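The two documented outcomes of reading missing data from an incomplete
dumpfile can be sketched in Python as follows. The read_page name, the
dictionary of stored pages and the 4096-byte page size are illustrative
assumptions, not crash internals.

```python
def read_page(stored_pages, pfn, page_size=4096, zero_excluded=False):
    """Return the page at pfn, or handle a page missing from the dump."""
    data = stored_pages.get(pfn)
    if data is not None:
        return data
    if zero_excluded:
        # Corresponds to "--zero_excluded" / "set zero_excluded on":
        # hand back zero-filled data instead of failing the read.
        return bytes(page_size)
    raise OSError(f"read error: page frame {pfn} is missing from the dumpfile")
```

Either outcome can mislead a caller, which is why the entry above warns
about unpredictable runtime behavior.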
- If a kernel has been configured with CONFIG_DEBUG_INFO_REDUCED, then
the crash utility will fail to initialize, typically with a message
indicating "no debugging data available". However, it has been
reported (on a 32-bit ARM system) that the initialization sequence
continued on beyond that message point, and the session failed later
on with the message "neither runqueue nor rq structures exist". As
an aid to understanding why the session failed, if the target kernel
is configured with CONFIG_IKCONFIG, and CONFIG_DEBUG_INFO_REDUCED has
been set to "y", a relevant warning message will be displayed.
(anderson@redhat.com)
- Implemented support for this Linux 3.18 commit for kernels that are
configured with CONFIG_SLAB:
commit bf0dea23a9c094ae869a88bb694fbe966671bf6d
mm/slab: use percpu allocator for cpu cache
The commit above redesigned the kmem_cache.array_cache[] from a
hardwired array to a per-cpu pointer referencing external array_cache
structures. Without the patch, the crash session would fail during
initialization with the message "crash: cannot resolve cache_cache".
Note that it could be worked around by using the "--no_kmem_cache"
command line option, with a resulting loss of functionality for
commands requiring slab-related data.
(anderson@redhat.com)
- Implemented a new "sys -t" option that displays kernel taint
information. If the "tainted_mask" symbol exists, the option will
show its hexadecimal value and translate each bit set to the symbolic
letter of the taint type. On kernels prior to 2.6.28 which had the
"tainted" symbol, only its hexadecimal value is shown. The relevant
kernel sources should be consulted for the meaning of the letter(s)
or hexadecimal bit value(s).
(anderson@redhat.com)
- Cosmetic fix for the "help -[n|D]" translation of the bitmap contents
of the kdump_sub_header.dump_level flag in compressed kdump dumpfiles.
(anderson@redhat.com)
- Fix for the support of compressed kdump clones created with the KVM
"virsh dump --memory-only --format <compression-type>" command,
where the compression-type is either "kdump-zlib", "kdump-lzo" or
"kdump-snappy". Without the patch, if an x86_64 guest kernel was loaded
with a non-zero "phys_base", the "--machdep phys_base=<offset>" command
line option was required as a workaround or the crash session would fail
with the warning message "WARNING: cannot read linux_banner string"
followed by the fatal error message "crash: vmlinux and <dumpfile name>
do not match!".
(anderson@redhat.com)
(11/13/14)
7.0.8-1.fc22
- Fedora Rawhide build: crash-7.0.8-1.fc22 (09/15/14)
http://koji.fedoraproject.org/koji/buildinfo?buildID=577451
7.0.8 - Fix for the handling of 32-bit ELF xendump dumpfiles if the guest
was configured with more than 4GB of memory. Without the patch, the
crash session may fail during initialization with the error message
"crash: vmlinux and <dumpfile> do not match!".
(dslutz@verizon.com)
- Fix for file-handling errors when a compressed vmlinux.debug file
is followed by a vmlinux file on the crash command line. When the
crash session ends, two errors will occur:
(1) the vmlinux file will be deleted
(2) the temporary uncompressed version of the vmlinux.debug file
will remain in /var/tmp
This problem also occurs in the highly unlikely case where a
compressed vmlinux file is followed by a vmlinux.debug file on the
command line, and the uncompressed temporary version of the vmlinux
file is larger than the vmlinux.debug file. In that case:
(1) the vmlinux.debug file will be deleted
(2) the temporary uncompressed version of the vmlinux file
will remain in /var/tmp
(dmair@suse.com)
- Fix for the "search -t" option if the system has 2064 or more tasks.
Without the patch, the command fails with a dump of the crash utility
memory allocation statistics, ending with "search: cannot allocate
any more memory!".
(anderson@redhat.com)
- Fix for the "mod -S" command to find the debuginfo data for Red Hat
"kpatch" modules. Without the patch, the command would display
"mod: cannot find or load object file for <kpatch-module> module".
(anderson@redhat.com)
- Deprecated the "mount -f" option for Linux 3.13 and later kernels
containing commit eee5cc2702929fd41cce28058dc6d6717f723f87, which
removed the super_block.s_files list_head member and the open files
list that it contained. Without the patch, the command option fails
with the error message "mount: invalid structure member offset:
super_block_s_files"
(anderson@redhat.com)
- Fix for the handling of a compressed kdump that is damaged/truncated
such that the bitmap data in the dumpfile header is not contained
within the file. Without the patch, attempts to analyze it with a
vmlinux file, or using the "crash --osrelease" or "crash --log"
options with just the vmcore, result in the crash utility spinning
forever, endlessly performing reads of 0 bytes from the file without
recognizing the EOF condition.
(dwysocha@redhat.com)
- Fix for an ARM64 compilation failure of the embedded gdb file
"aarch64-linux-nat.c" in the Fedora fc21 rawhide environment, which
uses glibc-headers-2.19.90-24.fc21.
(anderson@redhat.com)
- Document the reason behind the deprecation of the "mount -f" option
for Linux 3.13 and later kernels if the option is attempted, and in
the "help mount" output, similar to the deprecated "mount -d" option.
(anderson@redhat.com)
- During initialization, reject useless ARM64 "(A)" absolute symbols
that begin with "__crc_". Without the patch, several thousand of
them may be displayed by "sym -l" prior to the first kernel virtual
address symbol.
(anderson@redhat.com)
- When running against an ARM64 dumpfile created with the "snap.so"
extension module, do not attempt to read the crash_notes. Since the
dumpfile was taken while running on a live system, the crash_notes,
if configured into the kernel, would not contain valid data. Without
the patch, the message "WARNING: could not retrieve crash_notes" is
displayed during session initialization.
(anderson@redhat.com)
- Determine the various ARM64 kernel virtual address ranges using the
kernel's VA_BITS value. It currently is hardwired in the kernel to
one of two values depending upon whether 4K or 64K pages are
configured. However, there are plans to support 16K pages, to make
VA_BITS a configurable value, and to make the number of page-table
levels configurable. Towards that end, the crash utility has been
changed to determine the VA_BITS value based upon known kernel
virtual addresses, and to then calculate the relevant kernel virtual
address ranges on that value instead of hardwiring them based upon
the page size.
(anderson@redhat.com)
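One plausible way to derive VA_BITS from a known kernel virtual address is
sketched below in Python. It assumes the kernel region starts at
0xffffffffffffffff << (VA_BITS - 1), and that the sampled address lies low
enough in that region for bit (VA_BITS - 2) to be clear. The function names
are hypothetical and the heuristic is not claimed to match crash's code.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def estimate_va_bits(kernel_vaddr):
    """Estimate VA_BITS from the highest clear bit of a kernel address."""
    for bit in range(63, -1, -1):
        if not kernel_vaddr & (1 << bit):
            # Bits 63..(VA_BITS - 1) are set in kernel addresses, so the
            # highest clear bit is assumed to be bit (VA_BITS - 2).
            return bit + 2
    raise ValueError("address has no clear bits")


def page_offset(va_bits):
    """Start of the kernel region under the layout assumed above."""
    return (MASK64 << (va_bits - 1)) & MASK64
```

Under these assumptions, the address ranges follow from the computed value
rather than from the page size.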
- Enhancement to the "kmem -S" option for Linux 3.2 and later kernels
configured with CONFIG_SLUB to display the address of each per-cpu
kmem_cache_cpu address and the contents of its per-cpu partial list.
(qiaonuohan@cn.fujitsu.com)
- If an ARM or ARM64 dumpfile does not contain the register sets of
the active tasks in the kernel's per-cpu crash_notes, there is an
initialization-time warning message indicating "could not retrieve
crash_notes". It has been changed to a more meaningful warning
message indicating "cannot retrieve registers for active tasks".
(anderson@redhat.com)
- Implement support for ARM and ARM64 raw RAM dumpfiles. One or
more "ramdump" files may be entered on the crash command line
in an ordered pair format consisting of the RAM dump filename
and the starting physical address expressed in hexadecimal,
connected with an at sign ("@"):
$ crash vmlinux ramdump@address [ramdump@address]
A temporary ELF header will be created in /var/tmp, and the
combination of the header and the ramdump file(s) will be handled
like a normal ELF vmcore. The ELF header will only exist during
the crash session. If desired, an optional "-o <filename>"
may be entered to create a permanent ELF vmcore file from the
ramdump file(s).
(vinayakm.list@gmail.com, paawan1982@yahoo.com, anderson@redhat.com)
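As an illustration of the "ramdump@address" argument format, the Python
sketch below splits each argument into a filename and a starting physical
address. The function name and the example filenames are hypothetical.

```python
def parse_ramdump_args(args):
    """Split 'file@hexaddress' arguments into (filename, address) pairs."""
    pairs = []
    for arg in args:
        # Split on the last '@' so a filename containing '@' still parses.
        filename, sep, address = arg.rpartition("@")
        if not sep or not filename:
            raise ValueError(f"expected ramdump@address, got {arg!r}")
        # The address is hexadecimal; an optional "0x" prefix is accepted.
        pairs.append((filename, int(address, 16)))
    return pairs
```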
- Fix for the "help -[nD]" ELF header translation to recognize the
EM_ARM and EM_AARCH64 values as "e_machine" types, and ELFOSABI_LINUX
as an "e_ident[EI_OSABI]" type. Without the patch, the e_machine
translation would show "40 (unsupported)" for 32-bit ARM, or
"183 (unsupported)" on ARM64; and the ELFOSABI_LINUX type would
be translated as "3 (?)".
(anderson@redhat.com)
- Re-run a command in the history list by entering an "!" followed by
the number identifying the command. However, unlike the similar "r"
pseudo-command, if the number is a command name in the user's PATH,
maintain the current behavior and execute that command.
(anderson@redhat.com)
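The decision described above, history re-run versus shell command, can be
sketched in Python as follows. The classify_bang name and its return
strings are hypothetical; shutil.which stands in for the PATH lookup and
can be replaced for testing.

```python
import shutil


def classify_bang(argument, in_path=shutil.which):
    """Decide how an '!<argument>' entry would be handled."""
    # A number re-runs the matching history entry, unless a command
    # with that name exists in the user's PATH.
    if argument.isdigit() and in_path(argument) is None:
        return "history"
    return "shell"
```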
- Fix to recognize that the live system "crash.ko" memory driver may
be compressed and named "crash.ko.xz". Without the patch, the driver
is not recognized and loaded, and as a result the /dev/mem driver
and/or /proc/kcore will be tried as the live memory source.
(anderson@redhat.com)
- On a live system during session initialization, delay the first read
error message (typically when reading the "cpu_possible_mask") until
it is confirmed that all of the following are true:
(1) /dev/crash does not exist, and
(2) /dev/mem is restricted via CONFIG_STRICT_DEVMEM, and
(3) /proc/kcore cannot be read/accessed.
The "kernel may be configured with CONFIG_STRICT_DEVMEM" and
the "trying /proc/kcore as an alternative" messages will still
be displayed when appropriate. The read error message will be displayed
only if all three live memory read options fail.
(anderson@redhat.com)
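The deferred error reporting can be pictured with the Python sketch below.
The open_live_memory function and the try_open callback are hypothetical;
the callback is assumed to return None on success or an error string on
failure.

```python
def open_live_memory(sources, try_open):
    """Try each live memory source in order; report errors only if all fail."""
    deferred = []
    for path in sources:
        error = try_open(path)
        if error is None:
            # A later source worked, so earlier failures stay silent.
            return path, []
        deferred.append(f"{path}: {error}")
    # Every source failed: only now are the saved errors surfaced.
    return None, deferred
```

Called with /dev/crash, /dev/mem and /proc/kcore in that order, the sketch
returns the first usable source and an empty error list.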
- Fortify the validity verification of the data structures traversed
by the "kmem [-sS]" options for kernels configured with CONFIG_SLUB.
Without the patch, the contents of several structure members are not
validated, and may generate bogus or never-ending output, typically
seen when running the commands on a "live dump" where the dumpfile
was taken while the kernel was still running. The patch aborts the
relevant parts of per-kmem_cache output when invalid data is
encountered or if an object list contains duplicate entries, and
error messages have been enhanced to more accurately describe the
issues encountered.
(anderson@redhat.com)
- Implement support for the ppc64le PPC64 little-endian architecture.
Since this required a large number of patches to be applied to
architecture-neutral files in the gdb-7.6 tree, the changes are
only applied if the host build system is a ppc64le.
(ptesarik@suse.cz, normand@linux.vnet.ibm.com)
- Fix for SMP active task register-gathering from "kvmdump" dumpfiles
that were created with a cpu version id of 12 or greater that contain
additional XSAVE related fields in their cpu device headers. Without
the patch, active tasks running on cpus above 0 may have truncated
backtraces.
(uobergfe@redhat.com)
- Maintain backwards-compatibility for "kvmdump" dumpfiles that were
created by older development versions of KVM tools in which the
cpu version id was 12, but the cpu device headers did not contain
the additional XSAVE related fields.
(uobergfe@redhat.com)
- Address a "ps" command performance degradation that was introduced by
a crash-7.0.4 patch which added per-thread task_struct.rss_stat page
counts to the task's mm_struct.rss_stat page counts in order to show
an accurate/synchronized RSS value. Without the patch, the "ps"
command performance would degrade as the number of tasks increased,
most notably when there were thousands of tasks.
(panfy.fnst@cn.fujitsu.com, anderson@redhat.com)
(09/11/14)
7.0.7-1.fc21
- Fedora Rawhide build: crash-7.0.7-1.fc21 (06/11/14)
http://koji.fedoraproject.org/koji/buildinfo?buildID=537803
7.0.7 - Export the static ELF and compressed kdump vmcoreinfo_read_string()
functions from netdump.c and kdump.c via a new read_vmcoreinfo()
method in the global program_context structure. The function
get_log_from_vmcoreinfo() will access vmcoreinfo data via the
new pointer instead of requiring its callers to pass pointers to
their dumpfile-specific function.
(anderson@redhat.com)
- Linux 3.15 and later kernels configured with CONFIG_RANDOMIZE_BASE
can now be readily identified because of new kernel symbols that
have been added. For those kernels, the new "--kaslr=<offset>"
and/or "--kaslr=auto" options are not necessary for ELF or compressed
kdump vmcores, or for live systems that have /proc/kallsyms showing
the relocated symbol values. A new KASLR initialization function
called kaslr_init() is now called by symtab_init() prior to the
initial symbol-sorting operation. If kaslr_init() determines that
KASLR may be in effect, it will trigger a search for the relevant
vmlinux symbols during the sorting operation, which in turn will
cause the relocation value to be automatically calculated.
(anderson@redhat.com)
- Implemented a new "bt -c cpu(s)" option to display the backtrace
of the active task on one or more cpus. The cpus must be specified
in a comma- and/or dash-separated list; for example "3", "1,8,9",
"1-23", or "1,8,9-14". Similar to "bt -a", the option is only
applicable with crash dumps.
(atomlin@redhat.com)
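The comma- and/or dash-separated cpu list format can be parsed along the
lines of the following Python sketch. The parse_cpu_list name is
hypothetical and the code is not taken from crash.

```python
def parse_cpu_list(spec):
    """Expand a cpu list such as "1,8,9-14" into sorted cpu numbers."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-", 1)
            low, high = int(first), int(last)
            if low > high:
                raise ValueError(f"invalid cpu range: {part!r}")
            cpus.update(range(low, high + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)
```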
- Fix for Linux 3.11 and later ARM kernels, in which all non-panicking
cpus offline themselves during a kdump procedure. This causes an
invalid cpu count determination during crash session initialization
from an ARM vmcore. The patch utilizes the cpu count found in the
cpu_active_map if it is greater than the count in the cpu_online_map.
In addition, the maximum NR_CPUS value for the ARM architecture has
been raised from 4 to 32.
(sdu.liu@huawei.com)
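The cpu count selection can be sketched in Python as below. Representing
the cpu maps as integer bitmasks, and the function name itself, are
simplifying assumptions for illustration.

```python
def arm_cpu_count(cpu_online_map, cpu_active_map):
    """Use the active-map count when it exceeds the online-map count."""
    online = bin(cpu_online_map).count("1")
    active = bin(cpu_active_map).count("1")
    return max(online, active)
```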
- Fix for the X86_64 "bt" command on Linux 3.3 and later kernels to
properly display exception frame register contents on NMI stacks.
Kernel commit 3f3c8b8c4b2a34776c3470142a7c8baafcda6eb0 added 12 more
values to the NMI exception stack to handle nested NMIs caused by
page faults or breakpoints that could occur while handling an NMI
exception. The fix has two parts:
1. Determine if this kernel has the nested NMI layout and set a
machine-specific flag (NESTED_NMI) if it does.
2. When backtracing an NMI stack, use the saved values instead of
those found at the top of stack.
Kernel commit 28696f434fef0efa97534b59986ad33b9c4df7f8 changed
the stack layout again, swapping the location of the "saved" and
"copied" registers. This can be detected automatically, because the
"copied" registers contain either a copy of the "saved" registers,
or point to "repeat_nmi". So, if "repeat_nmi" is found as the return
address, assume that this is the old layout, and adjust the stack
pointer again. Without the patch, incorrect register values are
displayed in the exception frame dump in the NMI stack backtrace.
(ptesarik@suse.cz)
- Fix for the built-in "g" alias, which apparently has not worked
correctly since crash-5.1.4. Without the patch, if the "g" alias
and the first argument are separated by one space, then the first
character of that argument would get stripped prior to being
passed to the embedded gdb module.
(anderson@redhat.com)
- Removed the BASELEVEL_REVISION string from defs.h, which serves no
purpose since the deprecation of the remote daemon, and typically
has been out of sync with the crash version.
(anderson@redhat.com)
- Fix for the "p", "irq", "struct", "union" and "*" commands if a
cpu specification contains an invalid cpu number. Without the
patch, a segmentation violation may be generated.
(anderson@redhat.com)
- Implemented a new capability for the "ptov" command that takes a
per-cpu offset and cpu specification argument and translates it
into the kernel virtual addresses for the cpus specified.
(anderson@redhat.com)
- Implemented a new "ps -m" option that is a similar, complementary
option to "ps -l", but which translates the task timestamp value from
a decimal or hexadecimal nanoseconds value into a more human-readable
string consisting of the number of days, hours, minutes, seconds and
milliseconds that have elapsed since the task started executing on a
cpu. More accurately described, it is the time difference between
the timestamp copied from the per-cpu runqueue clock when the task
last started executing compared to the most current value of the
per-cpu runqueue clock.
(anderson@redhat.com, bud.brown@redhat.com)
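The time translation can be sketched in Python as follows. The function
name and the exact output layout are assumptions; only the units (days,
hours, minutes, seconds, milliseconds) and the subtraction of the task
timestamp from the runqueue clock come from the description above.

```python
def format_elapsed(runqueue_clock_ns, task_timestamp_ns):
    """Format the nanosecond difference as 'D HH:MM:SS.mmm'."""
    delta_ms = max(0, runqueue_clock_ns - task_timestamp_ns) // 1_000_000
    seconds, millis = divmod(delta_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return f"{days} {hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"
```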
- In addition, a new "ps -C <cpu-specifier>" option has been added
that can only be used with "ps -l" and "ps -m", which sorts the
global task list into per-cpu blocks; the cpu-specifier uses the
standard comma or dash separated list, expressed as "-C 1,3,5",
"-C 1-3", "-C 1,3,5-7,10", or "-Call" or "-Ca" for all cpus.
(anderson@redhat.com)
- Implemented a new "runq -m" option that is a similar, complementary
option to "runq -t", but which displays the amount of time that the
active task on each cpu has been running, expressed in a format
consisting of days, hours, minutes, seconds and milliseconds.
(anderson@redhat.com)
- Implemented a new "kmem -h" option that displays the address of
each hugepage hstate array entry, its hugepage size, its free and
total counts, and name string.
(anderson@redhat.com)
- Implemented a new "ps -S" option that displays a summary consisting
of the number of tasks in a task state.
(anderson@redhat.com)
- Fix for the "arguments-input-file" feature to protect against a
called command modifying an argument string. For example, the
"struct" command modifies "-l struct_name.member" argument strings,
and so without the patch, all iterative calls after the first one
will fail.
(anderson@redhat.com)
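The protection can be illustrated with a Python sketch in which every
iteration receives a fresh copy of the arguments. Both function names are
hypothetical, and mutating_command only mimics a command that rewrites an
argument in place.

```python
def run_with_input_file(command, args, input_lines):
    """Call command once per input line, each time with a copy of args."""
    results = []
    for line in input_lines:
        # A fresh copy per call keeps in-place edits made by the command
        # from leaking into the next iteration.
        call_args = list(args) + [line.strip()]
        results.append(command(call_args))
    return results


def mutating_command(argv):
    """Stand-in for a command that modifies one of its arguments."""
    argv[0] = argv[0].split(".")[0]
    return tuple(argv)
```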
- Fix failure to build from source when compiling the crash utility
with gcc-4.9. Without the patch, the crash utility build generates
the following error:
In file included from opncls.c:26:0:
opncls.c: In function 'bfd_fopen':
bfd.h:529:65: error: right-hand operand of comma expression has no
effect [-Werror=unused-value]
#define bfd_set_cacheable(abfd,bool) (((abfd)->cacheable = bool), TRUE)
^
opncls.c:263:5: note: in expansion of macro 'bfd_set_cacheable'
bfd_set_cacheable (nbfd, TRUE);
cc1: all warnings being treated as errors
(anderson@redhat.com, anatol.pomozov@gmail.com)
- Fix for displaying enum values that are greater than 32-bits in
size. Without the patch, the upper 32-bits are clipped off and
displayed as an integer-sized value.
(anderson@redhat.com)
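The clipping effect is simple arithmetic, shown here in a small Python
sketch. The clip_to_32_bits helper is hypothetical and only models the
symptom of keeping the low 32 bits of a wider enum value.

```python
def clip_to_32_bits(value):
    """Keep only the low 32 bits, as the pre-fix display effectively did."""
    return value & 0xFFFFFFFF


# An enum value one past the 32-bit range loses its upper bits.
wide_enum_value = 0x1_0000_0001
clipped = clip_to_32_bits(wide_enum_value)
```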
- If the kernel (live or dumpfile) has the "kpatch" module installed,
the tag "[KPATCH]" will be displayed next to the kernel name in the
initial system banner and by the "sys" command.
(anderson@redhat.com)
- Fix for the "DEBUG KERNEL:" display in the initial system banner
and by the "sys" command when using a System.map file with a
Linux 3.0 and later debug kernel. Without the patch, the kernel
version is not displayed in parentheses following the debug kernel
name.
(anderson@redhat.com)
- If the gdb-<version>.patch file has changed and a rebuild is being
done from within a previously-existing build tree, "patch -N" the
gdb sources, and start the rebuild from the gdb-<version> directory
instead of the gdb-<version>/gdb directory.
(anderson@redhat.com)
- Fix to prevent a possible segmentation violation generated by the
"runq -g" command when run on a very active live system due to an
active task on a cpu exiting while the command is running.
(anderson@redhat.com)
- Fix for the "runq -g" command on Linux 3.15 and later kernels, where
the cgroup_name() function now utilizes kernfs_name(). Without the
patch, the command fails with the error message "runq: invalid
structure member offset: cgroup_dentry".
(anderson@redhat.com)
- Fix for the "extend" command when running with an x86_64 crash binary
that was built with "make target=ARM64" in order to analyze ARM64
dumpfiles on an x86_64 host. Without the patch, if the extend
command is used with an extension module built in the same manner,
it fails with the message "extend: <module>.so: not an ELF format
object file".
(Jan.Karlsson@sonymobile.com)
- Introduce support for 32-bit ARM kernels that are configured with
CONFIG_ARM_LPAE. The patch implements the virtual-to-physical
address translation of 64-bit PTEs used by ARM LPAE kernels.
(sdu.liu@huawei.com, weijitao@huawei.com)
(06/09/14)
7.0.6 - Fix for custom X86_64 kernels that change the declaration of the
context_switch() function so that it is not an inline function.
Without the patch, the message "crash: cannot determine thread return
address" is displayed during invocation, and backtraces of blocked
tasks may have missing or invalid frames.
(ahonig@google.com)
- Fix to prevent a possible invocation-time error on Linux 3.7 and
later kernels configured with CONFIG_SLAB, running against vmcore
files filtered with the makedumpfile(8) facility. Without the
patch, the message "crash: page excluded: kernel virtual address:
<address> type: kmem_cache buffer" is immediately followed by
the message "crash: unable to initialize kmem slab cache subsystem".
Because of a kernel data structure name change from "cache_cache" to
"kmem_cache_boot", the crash utility failed to properly downsize
the stored size of the kernel's kmem_cache data structure from the
size indicated by the vmlinux debuginfo data. This in turn could
lead to reading beyond the end of a kmem_cache data structure into
a page of memory that had been excluded from the vmcore. The fix
was also applied to kernels configured with CONFIG_SLUB.
(anderson@redhat.com)
- Added a new "--kaslr <offset>" command line option for X86_64
kernels that are configured with CONFIG_RANDOMIZE_BASE. The offset
value must be equal to the difference between the symbol values
compiled into the vmlinux file and their relocated KASLR values.
(ahonig@google.com, anderson@redhat.com)
- Added a new "--kaslr=auto" command line option for X86_64 kernels
that are configured with CONFIG_RANDOMIZE_BASE. When set to
"auto", the KASLR relocation value will be determined automatically
by comparing the "_stext" symbol value compiled into the vmlinux file
with the _stext symbol value stored in kdump vmcoreinfo data; on live
systems the comparison will be made with the "_stext" symbol value
that is found in /proc/kallsyms.
(ahonig@google.com, anderson@redhat.com)
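The offset described in the two "--kaslr" entries above can be sketched in
Python as follows. This is an illustration only, not the crash
implementation: the helper names and the sample addresses are made up, and
the sketch assumes the offset is the relocated "_stext" value minus the
value compiled into the vmlinux file.

```python
def find_symbol(kallsyms_text, name):
    """Return the address of `name` from /proc/kallsyms-style text."""
    for line in kallsyms_text.splitlines():
        fields = line.split()
        # Each line looks like: "<hex address> <type> <symbol> [module]"
        if len(fields) >= 3 and fields[2] == name:
            return int(fields[0], 16)
    raise KeyError(name)


def kaslr_offset(vmlinux_stext, runtime_stext):
    """Relocated symbol value minus the value compiled into vmlinux."""
    return runtime_stext - vmlinux_stext


# Made-up sample data, not taken from a real system.
SAMPLE_KALLSYMS = """\
ffffffff8e000000 T _text
ffffffff8e000000 T _stext
ffffffff8e0001c0 T do_one_initcall
"""
VMLINUX_STEXT = 0xffffffff81000000  # hypothetical value from vmlinux

offset = kaslr_offset(VMLINUX_STEXT, find_symbol(SAMPLE_KALLSYMS, "_stext"))
print(hex(offset))
```

A value computed this way is the kind of number that would be passed as
"--kaslr <offset>" when the automatic detection is not used.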
- Enable kernel text line number capability for the "dis -l", "bt -l",
"sys -c", and "sym" commands for kernels that are configured with
CONFIG_RANDOMIZE_BASE.
(anderson@redhat.com)
- Fix for the "crash --log vmcore" command to account for the kernel
data structure and VMCOREINFO string name changes from "log" to
"printk_log" in Linux 3.11-rc4 and later kernels. Without the patch,
the command fails with the error message "crash: VMCOREINFO: no log
buffer data".
(anderson@redhat.com)
- Adjustment to the internal symbol-handling to prevent the usage of
kernel system call alias/wrapper names, for example, "SyS_read" and
"compat_SyS_futex" instead of "sys_read" and "compat_sys_futex".
Without the patch, commands such as "dis", "sym <address>", and
"sys -c" display the alias/wrapper name instead of the real system
call name in Linux 3.10 and later kernels.
(anderson@redhat.com)
- Increase the internal hash queue head count from 128 to 32768.
The hash queue is used for gathering and verifying lists, and the
original count of 128 may be overwhelmed if a list is extremely
large. For example, on a 256GB system with 192GB of free pages,
the "kmem -f" command takes hours to complete; with this patch,
the time is reduced to a few minutes. In addition, a new command
line option "--hash <count>" has been added to allow a user to
override the default hash queue head count of 32768.
(anderson@redhat.com)
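To show why the hash queue head count matters, here is a small Python
sketch of a chained hash table that checks every new list entry against
the entries already on its chain. The hash function, the entry values and
the function name are assumptions made for this example and do not mirror
the crash internals.

```python
def count_comparisons(entries, head_count):
    """Insert entries into a chained hash table, checking each chain for
    duplicates, and return how many entry comparisons were performed."""
    heads = [[] for _ in range(head_count)]
    comparisons = 0
    for entry in entries:
        # Hypothetical hash: drop the low 6 bits, then take the modulo.
        chain = heads[(entry // 0x40) % head_count]
        for seen in chain:
            comparisons += 1
            if seen == entry:
                raise ValueError("duplicate list entry: %#x" % entry)
        chain.append(entry)
    return comparisons


# 20,000 made-up entry addresses spaced 0x40 bytes apart.
entries = list(range(0, 20000 * 0x40, 0x40))
few_heads = count_comparisons(entries, 128)
many_heads = count_comparisons(entries, 32768)
print(few_heads, many_heads)
```

With few heads each chain grows long and every insertion walks it, so the
work grows roughly with the square of the list length; with enough heads
the chains stay short and the cost stays close to linear.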
- Fix for the "kmem -F" display and the "kmem -f <address>" or
"kmem <address>" options. Without the patch, "kmem -F" does
not display the first page in a list of free page blocks on its own
line, but rather at the end of the previous line that shows the area
number, block size, and free_area struct address that the page is
linked to. Due to this error, both "kmem -f <address>" and
"kmem <address>" would not find the associated page or page block
if it happened to be the first page or page block in the list.
(anderson@redhat.com)
- Created a new feature for the internal do_list() function if it
is necessary to immediately perform a function for each entry in a
list while the list is being traversed. A callback function, and an
optional callback data pointer, can be registered in the list_data
structure. The address of each entry in the list along with the
optional callback data pointer will be passed to the callback
function. If desired, the callback function may also dictate that
do_list() should stop the list traversal and return immediately to
its caller.
(anderson@redhat.com)
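The callback mechanism described above can be pictured with the following
Python sketch. It is a model of the idea only: walk_list(), its arguments
and the dictionary used to represent "next" pointers are hypothetical and
are not the do_list() interface.

```python
def walk_list(next_of, start, callback=None, callback_data=None):
    """Follow `next_of` links from `start` until the list ends or loops
    back to the start. Returns the number of entries visited.

    If `callback` is given, it is called as callback(entry, callback_data)
    for every entry; a true return value stops the traversal early."""
    count = 0
    entry = start
    while entry is not None:
        count += 1
        if callback is not None and callback(entry, callback_data):
            break
        entry = next_of.get(entry)
        if entry == start:      # circular list: back at the head
            break
    return count


# Made-up four-entry list: each key maps to the address of the next entry.
links = {0x100: 0x200, 0x200: 0x300, 0x300: 0x400, 0x400: None}


def stop_at(entry, target):
    """Callback that asks the walker to stop once `target` is reached."""
    return entry == target


visited = walk_list(links, 0x100, stop_at, 0x300)
print(visited)
```

Stopping from inside the callback avoids gathering the whole list first
when only one entry is of interest.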
- Made the "kmem -f <address>" and "kmem <address>" options more
efficient by using the new do_list() callback function feature above
as well as restricting the search to only the NUMA node that contains
the address.
(anderson@redhat.com)
- If the first assembly language instruction in an X86_64 function is
"nopl 0x0(%rax,%rax,1)" or "data32 data32 data32 xchg %ax,%ax",
which are generated when the ftrace facility is configured, the
X86_64 "dis" command will append "[FTRACE NOP]" to the line.
(anderson@redhat.com)
- Correction for the "crash -h" and crash.8 man page documentation of
the "--machdep phys_base=<physical-address>" command line option.
In both places the parameter mistakenly indicated "physbase".
(ptesarik@suse.cz)
- If a host build system does not have /usr/bin/wget installed, and
the crash package is built from a directory that was git-cloned
from github.com/crash-utility/crash.git, the error message has
been clarified to indicate "/usr/bin/wget is required to download
gdb-7.6.tar.gz". Without the patch, the message indicates "tar
(child): gdb-7.6.tar.gz: Cannot open: No such file or directory".
(anderson@redhat.com)
- Updated the ARM64 implementation to support Linux 3.13 and later
kernels that expand to a 42-bit address space when 64K pages are
configured. This is also the first crash version that has been
tested on a live ARM64 system with 4K pages, where it cleanly
makes it to the "crash>" prompt. However, it should be noted that
some commands (most notably "bt") still do not work as of yet.
(anderson@redhat.com)
- Document the "--machdep phys_offset=<physical-address>" command
line option for the ARM64 architecture in the crash.8 man page and
the "crash -h" output.
(anderson@redhat.com)
- Fix for KVM dumpfiles created with "virsh dump --memory-only" if
an X86_64 kernel was loaded with a non-zero "phys_base". Without
the patch, the crash session fails with the warning message "WARNING:
cannot read linux_banner string" followed by the fatal error message
"crash: vmlinux and <dumpfile name> do not match!".
(anderson@redhat.com)
- Initial working implementation of the basic ARM64 "bt" command, with
several command options still under development. In-kernel exception
frames are only dumped if the exception handler function is contained
within the symbol boundaries from "__exception_text_start" to
"__exception_text_end"; when ARM64 kdump is eventually implemented,
further exception-related work will be resumed.
(anderson@redhat.com)
- Cleaned up the exception frame displays of 64-bit in-kernel and both
32-bit and 64-bit user-mode exceptions.
(anderson@redhat.com)
- Implemented support for the ARM64 "bt -e" option.
(anderson@redhat.com)
- Implemented support for the ARM64 "bt -l" option.
(anderson@redhat.com)
- Update for the X86_64 "bt -l" option such that it also displays the
available file and line number information for functions indicated as
the "exception RIP" in kernel exception frames. The line number
information will follow the exception frame register dump.
(anderson@redhat.com)
- Fix for the ARM64 virtual-to-physical translation of vmemmap page
structure addresses for kernels configured with 4K pages. Without
the patch, any command that required the contents of a page structure
would fail with a readmem error.
(cldu@marvell.com, anderson@redhat.com)
- Added support for the ARM64 architecture in the extensions/snap.c
extension module. Also fixed the progress percentage display to
correct for systems which have non-zero starting physical addresses.
(anderson@redhat.com)
- Implemented support for the ARM64 "bt -f" and "bt -F[F]" options.
(anderson@redhat.com)
- Increase the ARM64 PTRS_PER_PGD_L2_64K from 1024 to 8192 to account
for the Linux 3.13 increase of the ARM64 virtual address space size
from 39 to 42 bits when 64K pages are configured. Without the patch,
the warning message "WARNING: cannot access vmalloc'd module memory"
is displayed during session initialization.
(anderson@redhat.com)
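The PGD entry counts in the entry above follow from the index arithmetic,
which this Python sketch works through. It assumes a two-level
translation with 64K pages and 8-byte page table entries; the constant and
function names are chosen for this example only.

```python
PAGE_SHIFT_64K = 16   # 64K pages: 2**16 bytes
PTE_SIZE = 8          # assumption: 8-byte page table entries


def pgd_entries(va_bits):
    """Top-level entries for a two-level page table with 64K pages."""
    ptes_per_page = (1 << PAGE_SHIFT_64K) // PTE_SIZE   # 8192 entries
    pte_index_bits = ptes_per_page.bit_length() - 1     # 13 bits
    pgd_index_bits = va_bits - PAGE_SHIFT_64K - pte_index_bits
    return 1 << pgd_index_bits


print(pgd_entries(39), pgd_entries(42))
```

Under these assumptions a 39-bit address space needs 1024 top-level
entries and a 42-bit address space needs 8192.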
- Fix for the "vm -p" option on ARM64 so that file-backed pages are
properly translated to the filename and offset. Without the patch,
file-backed pages are erroneously shown as being backed on a swap
device.
(anderson@redhat.com)
- Increment maximum ARM64 physical address from 40 to 48 bits to match
upstream kernel commit 87366d8cf7b3f6dc34633938aa8766e5a390ce33.
(anderson@redhat.com)
- Fix for a segmentation violation generated by the "crash -g vmlinux"
command on ARM64.
(anderson@redhat.com)
- Fix for the ARM64 "vtop <address>" command on kernels configured
with 64K pages if the address argument is located in the kernel
logical memory map region, which uses 512MB hugepage mappings.
Without the patch, the verbose page table walk mistakenly continues
to the PTE level.
(anderson@redhat.com)
- Fix for ARM64 /proc/kcore support. Without the patch, the crash
session fails with the warning message "WARNING: cannot read
linux_banner string" followed by the fatal error message "crash:
vmlinux and <dumpfile name> do not match!". At this point in
time, the kernel requires a patch to the ARM64 kern_addr_valid()
function to properly allow memory to be read from the kernel logical
memory map region.
(anderson@redhat.com)
(04/15/14)
7.0.5-1.fc21
- Fedora Rawhide build: crash-7.0.5-1.fc21 (02/28/14)
http://koji.fedoraproject.org/koji/buildinfo?buildID=501238
7.0.5 - Fix for the "runq -g" option for kernels that are configured with
CONFIG_FAIR_GROUP_SCHED, but not CONFIG_CFS_BANDWIDTH. Without the
patch, the command fails with the message "runq: invalid structure
member offset: cfs_rq_throttled".
(vinayakm.list@gmail.com)
- Add support for Xen PVH guest types introduced in Xen 4.4. Without
the patch, running against a Xen 4.4 hypervisor binary would fail
during session initialization with the error message "crash: invalid
structure member offset: domain_is_hvm". In addition, the PVH guest
type is being registered internally as an HVM guest type, the debug
"help -X ofs" command's display of the domain_domain_flags offset
has been fixed to show it in decimal, and the setting of the internal
dc->domain_flags has been fixed to contain all flags set, not just
the first one found.
(dslutz@verizon.com)
- Fix for the "kmem -S" command on Linux 3.1 and later kernels that are
configured with CONFIG_SLUB. Because the page structure's inuse
and objects fields used by SLUB were changed from discrete u16 types
to bit-fields within an unsigned int, the display of per-node partial
slab statistics is incorrect. Without the patch, the TOTAL and
ALLOCATED values are incorrectly shown as equal values, and therefore
the FREE value is always zero.
(anderson@redhat.com)
- Fix for the "kmem -S" command for kernels that are configured with
CONFIG_SLUB. Each per-cpu slab object dump may show incorrect
ALLOCATED and FREE values; and as seen on Linux 3.5 and later
kernels, the TOTAL value and the number of individual objects dumped
may also be incorrect (too small).
(anderson@redhat.com)
- When executing the commands from an input file specified by the
"-i <file>" command line option, or when accepting input from a
file as a set of commands or as a set of command arguments using the
"<" redirection character, unconditionally cease the operation if
CTRL-c is entered. Without the patch, depending upon the command
that was running when the SIGINT was received, the operation may
continue uninterruptibly until the file contents are consumed.
(anderson@redhat.com)
- Enhanced the "bt -F" option such that if "-F" is entered twice,
and if the stack frame contents reference a slab cache object, both
the slab cache name and the stack contents will be displayed within
brackets.
(anderson@redhat.com)
- Enhanced the "rd -S" option such that if "-S" is entered twice,
and if the memory contents reference a slab cache object, both the
slab cache name and the memory contents will be displayed within
brackets.
(anderson@redhat.com)
- Fix for the X86_64 "bt" command to prevent an unwarranted message
indicating "WARNING: possibly bogus exception frame" generated
from a blocked kernel thread that was in the process of exec'ing
a user process via the call_usermodehelper() facility.
(anderson@redhat.com)
- Fix for the X86_64 "bt" command to more correctly determine the
function frame that called into an interrupted function. Without
the patch, the first frame just above an IRQ exception frame
register dump may show an invalid/stale function.
(anderson@redhat.com)
- Fix for the X86_64 "bt" command if a page fault exception was
generated by the invalid contents of the RIP register. Without
the patch, the exception frame register dump is not displayed
above the "page_fault" stack frame; and in a related issue, the
"bt -e" option will not find and display the exception frame.
(anderson@redhat.com)
- When invoking a crash session with a compressed vmlinux file,
make the same host-machine/vmlinux endian verification that is
done with uncompressed vmlinux files.
(anderson@redhat.com)
- Reduce the number of CTRL-c entries required to unconditionally
terminate any manually-entered command from three to one.
(anderson@redhat.com)
- Fix for the X86_64 "bt" command if an async page fault exception
occurred in a KVM guest running a Linux 2.6.38 or later kernel.
Without the patch, the exception frame register dump is not displayed
above the "async_page_fault" stack frame.
(anderson@redhat.com)
(02/14/14)
7.0.4-1.fc21
- Fedora Rawhide build: crash-7.0.4-1.fc21 (12/16/13)
http://koji.fedoraproject.org/koji/buildinfo?buildID=485249
7.0.4 - Fix for the "ps" command's display of per-task RSS and %MEM values
in Linux 2.6.34 and later kernels in which SPLIT_RSS_COUNTING is
enabled. Without the patch, the values are only taken from each
task's mm_struct.rss_stat structure, which may contain stale values
because they may not be synchronized with the RSS values stored
in each per-thread task_struct.rss_stat structure; this may lead
to invalid or slightly low RSS values, and worst-case, the %MEM
value may show garbage percentage values.
(vinayakm.list@gmail.com)
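The RSS accounting issue above can be sketched in Python as follows. The
sketch assumes that a complete RSS value is the sum of the counters in the
mm_struct.rss_stat structure plus the not-yet-synchronized counters held in
each thread's task_struct.rss_stat; the function name and the sample
numbers are made up for this example.

```python
def total_rss_pages(mm_counters, per_thread_counters):
    """Add the per-thread cached counters to the shared mm counters."""
    total = sum(mm_counters)
    for thread_counters in per_thread_counters:
        total += sum(thread_counters)
    return total


# Made-up page counts: [file pages, anonymous pages].
mm_counters = [100, 20]
per_thread_counters = [[3, 1], [0, 2]]

stale = sum(mm_counters)                            # mm_struct only
complete = total_rss_pages(mm_counters, per_thread_counters)
print(stale, complete)
```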
- Addressed a few (harmless) Coverity Scan complaints in diskdump.c:
1579:dead_error_line – Execution cannot reach this expression ""|""
inside statement "fprintf(fp, "%sDUMP_DH_COMP...".
1574:dead_error_line – Execution cannot reach this expression ""|""
inside statement "fprintf(fp, "%sDUMP_HEADER_...".
1571:dead_error_line – Execution cannot reach this expression ""|""
inside statement "fprintf(fp, "%sDUMP_HEADER_...".
(anderson@redhat.com)
- Addressed two warnings when compiling diskdump.c on 32-bit architectures
when the snappy library is built in:
diskdump.c:1046: warning: passing argument 3 of
'snappy_uncompressed_length' from incompatible pointer type
/usr/include/snappy-c.h:120: note: expected ‘size_t *’ but argument
is of type ‘ulong *’
diskdump.c:1056: warning: passing argument 4 of ‘snappy_uncompress’
from incompatible pointer type
/usr/include/snappy-c.h:103: note: expected ‘size_t *’ but argument
is of type ‘ulong *’
(anderson@redhat.com)
- Created a simpler interface with the internal do_list() function.
Currently, if a caller wants to gather the contents of a list into
an array, it must do the following:
(1) call hq_open() so that the list contents will be verified and
saved in the hash queue
(2) call do_list() to store the list in the hash queue and return
the number of entries in the list
(3) allocate a buffer to store the array of entries in the list
(4) pass the allocated buffer to retrieve_list() to be populated
from the hash queue
(5) call hq_close()
With this patch, if the passed-in list_data.flags field has a new
LIST_ALLOCATE bit set, then do_list() will perform steps (1), (3),
(4) and (5) above. The caller can access the allocated array via
a new list_data.list_ptr member, and, when done parsing the list,
the allocated buffer should be returned via FREEBUF(). The only
restriction is that the hash queue cannot be currently in use, or
the do_list() call will fail. It should also be noted that there
are circumstances where it still makes sense that steps (1), (3),
(4) and (5) are performed by the do_list() caller.
(anderson@redhat.com)
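The five steps above, and how a LIST_ALLOCATE-style flag folds steps (1),
(3), (4) and (5) into a single call, can be modelled with this Python
sketch. Everything in it is a stand-in written for illustration: the
HashQueue class, do_list_sketch() and the flag value are hypothetical and
do not reproduce the crash C interfaces.

```python
LIST_ALLOCATE = 0x1   # stand-in flag value, chosen for this sketch only


class HashQueue:
    """Toy stand-in for the hash queue: verifies and stores entries."""

    def __init__(self):
        self.in_use = False
        self.entries = []

    def open(self):
        if self.in_use:
            return False
        self.in_use, self.entries = True, []
        return True

    def add(self, entry):
        if entry in self.entries:
            raise ValueError("duplicate list entry")
        self.entries.append(entry)

    def close(self):
        self.in_use = False


def do_list_sketch(hq, next_of, start, flags=0):
    """Walk the list and return (count, array). With LIST_ALLOCATE set,
    the open, allocate, retrieve and close steps are done here."""
    allocate = bool(flags & LIST_ALLOCATE)
    if allocate and not hq.open():      # step (1); fails if hq is busy
        return -1, None
    entry, count = start, 0
    while entry is not None:            # step (2): store and count
        hq.add(entry)
        count += 1
        entry = next_of.get(entry)
    if not allocate:
        return count, None              # caller performs (3), (4), (5)
    array = list(hq.entries)            # steps (3) and (4)
    hq.close()                          # step (5)
    return count, array


links = {0x100: 0x200, 0x200: 0x300, 0x300: None}   # made-up list
hq = HashQueue()
count, array = do_list_sketch(hq, links, 0x100, LIST_ALLOCATE)
print(count, [hex(entry) for entry in array])
```

As in the description above, the sketch refuses to run in the allocating
mode while the hash queue is already in use.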
- Modified the internal parent_list() function, used by "ps -p", to
utilize the simpler new do_list() functionality described above.
(anderson@redhat.com)
- Modified the internal dump_vmap_area() function, used by the
"kmem -v", "kmem <address>" and "search" commands, to utilize
the simpler new do_list() functionality.
(anderson@redhat.com)
- Modified the internal nr_blockdev_pages() function, used by the
"kmem -i" command, to utilize the simpler new do_list() functionality.
(anderson@redhat.com)
- Modified the internal get_mount_list() function, used by the
"mount" and "files -d" commands, to utilize the simpler new
do_list() functionality.
(anderson@redhat.com)
- Modified the internal get_kmem_cache_list() function, used by the
"kmem -[sS]", "kmem <address>", "rd -S" and "bt -F" commands,
to utilize the simpler new do_list() functionality.
(anderson@redhat.com)
- Modified the internal show_net_devices_v2() and show_net_devices_v3()
functions, used by the "net" command, to utilize the simpler new
do_list() functionality.
(anderson@redhat.com)
- The "help -r" option has been extended to dump the X86 and X86_64
registers stored in the NT_PRSTATUS notes in netdump ELF, kdump ELF,
and compressed kdump dumpfiles. Without the patch, the option only
supports ELF dumpfiles created by the "virsh dump --memory-only"
facility.
(anderson@redhat.com)
- Modified the "runq -g" display to show the CFS task_group and cfs_rq
addresses; the current task is also displayed in its CFS or RT
queue with the notation "[CURRENT]" appended to the task data.
(Anthony.Chen@Teradata.com)
- Additional modification to the "runq -g" option to display the task_group
and rt_rq addresses for the RT queues, similar in nature to the CFS
queue changes made by the patch above. In addition, the CFS
rb_root and RT prio_array addresses are no longer shown given that
they can be determined by looking at the cfs_rq and rt_rq structures
whose addresses are now displayed.
(anderson@redhat.com)
- Modified the behavior of the "mod -t" option when running against
Linux 2.6.18 and earlier kernels such that the hexadecimal value of
the module->license_gplok member is always displayed. Without the
patch, if a module's license_gplok boolean or bitmask value is 0,
it would only be displayed if the module was unsigned.
(anderson@redhat.com, atomlin@redhat.com)
- Modified the internal dump_tasks_in_task_group_rt_rq() and
dump_RT_prio_array() functions, used by the "runq" command,
to utilize the simpler new do_list() functionality.
(anderson@redhat.com)
- Resurrection of the remote analysis capability for use with the
"xen-crashd" daemon running on a Xen Dom0 host, which communicates
with a paused or shutdown DomU guest kernel. The daemon can be
accessed like so:
$ crash localhost:5001,/dev/xenmem vmlinux
(dslutz@verizon.com)
- Prevent the X86_64 "bt" command from using starting RSP and RIP
values taken from the NT_PRSTATUS notes of kdump dumpfiles if the
RSP address is not in the task's kernel stack, or in any of the
relevant per-cpu exception stacks. This can happen when the number
of NT_PRSTATUS notes does not match the number of online cpus.
Without the patch, the command may generate a segmentation violation,
fail with the error message "bt: cannot determine starting stack
pointer", or fail with an error message indicating that the command
"cannot transition" from an exception stack to the previous stack.
(anderson@redhat.com)
- Fix for the X86_64 "bt" command for displaying the backtraces of
active tasks running on the non-crashing cpus in kdump dumpfiles
in which the "crash_nmi_callback" function frame does not appear
in the per-cpu NMI exception stacks. That function frame is
normally used as the starting point for the backtraces of those
tasks, but if it does not exist, the "notify_die" frame will be
used instead. Without the patch, the backtraces of the active
non-crashing tasks are incorrect or incomplete.
(anderson@redhat.com)
- Increment the X86_64 NR_CPUS maximum value from 5120 to 8192
to be able to account for Linux 3.13-rc1 and later kernels
in which CONFIG_MAXSMP has been configured. In that case,
the CONFIG_NR_CPUS value is overridden, and the kernel NR_CPUS
value is set to 8192. Without the patch, the crash session
fails with the messages "WARNING: kernel-configured NR_CPUS (8192)
greater than compiled-in NR_CPUS (5120)" and "crash: recompile crash
with larger NR_CPUS".
(anderson@redhat.com)
- A crash-7.0.3 fix for the proper determination of the kernel NR_CPUS
configurable for Linux 3.8 and later kernels introduced a regression
in Linux 3.8 and later kernels if:
(1) the kernel is configured with CONFIG_SLAB, and
(2) the sum of the kernel's NR_CPUS and MAX_NUMNODES values exceed
the NR_CPUS value compiled into the crash utility.
Without the patch, the crash session generates a segmentation fault
while it indicates "please wait: gathering kmem slab cache data".
(anderson@redhat.com)
- Update for the extensions/trace.c extension module to support the
ftrace data structure changes introduced in Linux 3.10 kernels.
Without the patch, loading the module with the "extend" command fails
with the error message "extend: trace.so: no commands registered:
shared object unloaded".
(d.hatayama@jp.fujitsu.com)
- Increment the S390 and S390X NR_CPUS maximum value from 64 to 512.
(holzheu@linux.vnet.ibm.com)
- Implemented support for the redesigned per-slab object bookkeeping
that was introduced in Linux 3.13-rc1 for kernels configured with
CONFIG_SLAB. In those kernels, the head page structure associated
with each slab is overloaded to serve as the (now removed) slab data
structure, and the array of kmem_bufctl_t data structures that used
to be appended to each slab data structure for object management has
been replaced by a new freelist stack mechanism. Without the patch,
the crash session would fail during initialization with the error
message "crash: invalid structure member offset: kmem_cache_s_c_num".
It should be noted that this patch has only been tested on 3.13-rc1
kernels, which do not have the modified freelist_idx_t in place as
of yet, so the replacement of integer-sized indexes with byte or
short sized indexes had not been checked in. Furthermore, if this
proposed patch set gets accepted:
[RFC][PATCH 0/3] re-shrink 'struct page' when SLUB is on.
https://lkml.org/lkml/2013/12/11/589
then both CONFIG_SLAB and CONFIG_SLUB support in the crash utility
will be broken yet again.
(anderson@redhat.com)
- In order to facilitate the building of the crash binary with either
or both of the optional LZO or SNAPPY compression libraries, two new
Makefile targets have been added:
$ make lzo
$ make snappy
Without the patch, the CFLAGS.extra and LDFLAGS.extra files must be
created or modified as described in these changelog entries:
https://crash-utility.github.io/crash.changelog.html#LZO
https://crash-utility.github.io/crash.changelog.html#SNAPPY
This patch simply does the work automatically. After having done it
one time, there is no need to use the targets for subsequent builds.
The relevant libraries must pre-exist on the build machine.
(anderson@redhat.com)
- Long overdue update of the README file.
(anderson@redhat.com)
(12/13/13)
7.0.3-1.fc21
- Fedora Rawhide build: crash-7.0.3-1.fc21 (10/20/13)
http://koji.fedoraproject.org/koji/buildinfo?buildID=474614
7.0.3 - Fix for the ARM architecture if the backtrace unwind information
cannot be gathered during session initialization. Without the patch,
the two unwind-related warning messages indicating "WARNING: UNWIND:
failed to gather unwind_table list" and "WARNING: UNWIND: failed to
initialize module unwind tables" are followed by the fatal error
message "crash: cannot hash task_struct entries".
(anderson@redhat.com)
- Fix for the "help -[Dn]" dumpfile information display of the GUID EFI
table in the header of SADUMP dumpfiles. Without the patch, only 33
of the 36 bytes in the table are translated.
(d.hatayama@jp.fujitsu.com)
- Fix for the determination of the kernel NR_CPUS configurable for
Linux 3.8 and later kernels that are configured with CONFIG_SLAB.
Without the patch, the kernel's compiled-in NR_CPUS value was
incorrectly calculated to be the sum of the kernel's NR_CPUS and
MAX_NUMNODES configurables.
(anderson@redhat.com)
- In the next release of makedumpfile, the status field of the
dumpfile header of compressed kdumps will show the compression
type that was utilized. The "help -[Dn]" output has been updated
to display that information.
(anderson@redhat.com)
- For kernels configured with CONFIG_SLAB in which an array_cache
pointer referenced by a kmem_cache structure is invalid, the
individual cache(s) will be marked as invalid. During session
initialization, the message "crash: kmem_cache: <cache-address>:
invalid array_cache pointer" will be displayed, and during runtime,
attempts to access the cache(s) will result in a message indicating
that the cache is "[INVALID/CORRUPTED]". Without the patch, the
message "crash: unable to initialize kmem slab cache subsystem" is
displayed during session initialization, and run-time commands that
attempt to access the kmem slab cache subsystem fail with the error
message "kmem cache slab subsystem not available".
(anderson@redhat.com)
- Fix for the "kmem -[sS] <slab-object-address>" option in Linux 3.6
and later kernels configured with CONFIG_SLAB. Without the patch,
the command fails with the message "kmem: address is not allocated in
slab subsystem: <slab-object-address>". This also causes the
"kmem <slab-object-address>" command to (quietly) fail to determine
that the address is a slab object.
(anderson@redhat.com)
- Fix for the "bt" command if a kernel __init text address is
encountered. Without the patch, and depending upon the reallocation
of the __init text memory, a bogus framesize may be calculated, or
more likely, in a compressed kdump, a warning message indicating
"bt: page excluded: kernel virtual address: <address> type:
gdb_readmem_callback" will be displayed following the frame data.
(anderson@redhat.com)
- Update for determining whether an S390X PTE contains a swap entry
in Linux 3.12 and later kernels.
(holzheu@linux.vnet.ibm.com)
- Resurrected the translation and display of the page.flags bits by the
"kmem -p" command on Linux 2.6.26 and later kernels whose vmlinux
debuginfo data contains either the "pageflags" enumerator or the
"pageflag_names" array of trace_print_flags structures. If they are
not available, just the page.flags value is printed in hexadecimal,
as has been done since Linux 2.4.9.
(anderson@redhat.com)
- Fix for the "bt" command when used with vmcore files that were
created with the recently-introduced "virsh dump --memory-only",
which dumps KVM guests into an ELF vmcore similar to those created
by the kdump facility. Without the patch, a faulty backtrace for the
panic task may be generated due to the use of incorrect starting
RSP/RIP registers; this happens because (unlike kdump) the
non-panicking cpus are offlined prior to the dumpfile being created,
which in turn leads to the use of the wrong NT_PRSTATUS note.
(anderson@redhat.com)
- Fix for the CPU number display on systems with 255 or more cpus
during the initial banner, by the "set" command, the "ps" command,
and by all commands that display the per-task header consisting of
the task address, pid, cpu and command name. Without the patch, for
cpu 255, the "sys" command displays "NO_PROC_ID", and the other
commands would show a "-" for the cpu number; for cpu numbers greater
than 255, garbage values would be displayed in the cpu number field.
(anderson@redhat.com)
- Implemented support for compressed kdump header version 6, in which
makedumpfile(8) adds new fields in the kdump_sub_header to support
large memory systems with pfn values that are larger than 32-bits.
Without the patch, if the system contains physical memory located
in high memory such that its maximum pfn value overflows the
32-bit "max_mapnr" field in the header, the crash session will fail
with the error message "crash: vmlinux and vmcore do not match!".
(jingbai.ma@hp.com)
- Fix for the "net -s" command on Linux 3.8 and later kernels. Without
the patch, the command fails with the message "net: invalid structure
member offset: inet_opt_daddr".
(anderson@redhat.com)
- Fix a build failure in a native ARM64 environment due to the use of
obsolete LKCD dumpfile headers.
(anderson@redhat.com)
- Implementation of a new "per-cpu object" as an argument format that
can be passed to the "p", "struct", "union" or "*" commands. The
format is expressed as either <per-cpu symbol>:<cpu-specifier> or
as <per-cpu offset>:<cpu-specifier>, where the per-cpu symbol or
per-cpu offset must precede a colon, and where the <cpu-specifier>
follows the colon. The cpu-specifier may be expressed in any of
the following manners:
:             CPU of the currently selected task.
:a[ll]        all CPUs.
:#[-#][,...]  CPU list(s), e.g. "1,3,5", "1-3", or "1,3,5-7,10".
Without the patch, per-cpu symbols are only accepted by the "p"
command, and the data type and the resolved kernel virtual address
for each per-cpu instance are displayed. With this patch, a
colon and a cpu-specifier may be appended to the symbol name, and the
contents of the symbol on each cpu that is specified will be
displayed by the "p" command. For the "struct/union/*" commands, an
argument may be specified using either a per-cpu offset value or
per-cpu symbol name followed by a colon and cpu-specifier, and the
contents of each structure/union on each specified cpu will be
displayed.
(ptesarik@suse.cz)
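The cpu-specifier forms listed above can be interpreted as in the
following Python sketch. It is not the crash parser: parse_cpu_spec(), its
arguments and its error handling are assumptions made for this example,
and the text passed in is the part that follows the colon.

```python
def parse_cpu_spec(spec, nr_cpus, current_cpu):
    """Turn the text after the colon into a sorted list of cpu numbers.

    ""            -> the cpu of the currently selected task
    "a" or "all"  -> every cpu
    "1,3,5-7,10"  -> the listed cpus and inclusive ranges
    """
    if spec == "":
        return [current_cpu]
    if spec in ("a", "all"):
        return list(range(nr_cpus))
    cpus = set()
    for part in spec.split(","):
        low, dash, high = part.partition("-")
        first = int(low)
        last = int(high) if dash else first
        if first > last or last >= nr_cpus:
            raise ValueError("invalid cpu range: %r" % part)
        cpus.update(range(first, last + 1))
    return sorted(cpus)


# Hypothetical 16-cpu system with the current task on cpu 2.
print(parse_cpu_spec("1,3,5-7,10", nr_cpus=16, current_cpu=2))
print(parse_cpu_spec("", nr_cpus=16, current_cpu=2))
```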
- Fixed several minor flaws that were detected by a Coverity Scan:
tools.c:
992:warning[invalidScanfArgType_int] – %d in format string
(no. 1) requires 'int *' but the argument type is 'unsigned
int *'.
memory.c:
7461:error[uninitvar] – Uninitialized variable: page_cache_size
filesys.c:
731:error[resourceLeak] – Resource leak: version
kernel.c:
5675:error[uninitvar] – Uninitialized variable: action
7799:error[memleakOnRealloc] – Common realloc mistake:
'ikconfig_all' nulled but not freed upon failure
configure.c:
793:error[mismatchAllocDealloc] – Mismatching allocation and
deallocation: fp
remote.c:
1120:error[resourceLeak] – Resource leak: pipe
va_server.c:
316:error[memleak] – Memory leak: disk_hdr
va_server_v1.c:
311:error[memleak] – Memory leak: disk_hdr
makedumpfile.c:
80:error[memleakOnRealloc] – Common realloc mistake: 'ptr' nulled
but not freed upon failure
sadump.c:
231:error[memleakOnRealloc] – Common realloc mistake: 'sdh'
nulled but not freed upon failure
extensions/snap.c:
550:error[uninitvar] – Uninitialized variable: prstatus_len
541:error[uninitvar] – Uninitialized variable: l_offset
extensions/trace.c:
1477:error[resourceLeak] – Resource leak: file
(anderson@redhat.com)
(10/25/13)
7.0.2-1.fc21
- Fedora Rawhide build: crash-7.0.2-1.fc21 (09/04/13)
http://koji.fedoraproject.org/koji/buildinfo?buildID=461865
7.0.2 - Added "bison" to the BuildRequires line of the crash.spec file.
Without the patch, the build of the embedded gdb-7.6 module will fail
unless either /usr/bin/bison or /usr/bin/yacc are available. The
failure will result in a stream of error messages from different
files that indicate:
multiple definition of 'main'
undefined reference to 'c_parse_escape'
undefined reference to 'ada_parse'
undefined reference to 'ada_error'
undefined reference to 'c_parse'
undefined reference to 'c_error'
undefined reference to 'cp_demangled_name_to_comp'
undefined reference to 'cp_demangled_name_parse_free'
undefined reference to 'cp_comp_to_string'
undefined reference to 'cp_new_demangle_parse_info'
and the build fails like so:
collect2: ld returned 1 exit status
make[4]: *** [gdb] Error 1
crash build failed
If building with rpmbuild, the new BuildRequires "bison" entry will
prevent the build from initiating unless the bison package has been
installed. If building with the tar.gz file, the build attempt will
proceed and fail unless either the bison or byacc (Berkeley Yacc)
package is installed.
(anderson@redhat.com)
- Fix the S390X initialization sequence on kernels that are configured
with CONFIG_STRICT_DEVMEM to automatically try /proc/kcore if:
(1) the /dev/crash driver is not available, and
(2) the initial /dev/mem access fails.
Without the patch, if /dev/mem is selected as the memory source and
it is restricted, the crash session will fail during initialization
with the error message "crash: read error: kernel virtual address:
<address> type: cpu_possible_mask".
(anderson@redhat.com)
- When checking whether an argument on the crash command line is a
dumpfile that may be in makedumpfile's "flattened" format, do not
bother checking character device files.
(anderson@redhat.com)
- Fix for the PPC64 virtual-to-physical virtual address translation
mechanism for vmalloc and user-space virtual addresses on Linux 3.10
and later kernels. Without the patch, the message "WARNING: cannot
access vmalloc'd module memory" is displayed during initialization,
and during the crash session, if a command attempts to translate or
read a vmalloc or user-space virtual address, it will fail.
(anderson@redhat.com)
- Clean up all files that emit "warning: format not a string literal
and no format arguments" when compiled with the -Wformat-security warning
option. All instances of fprintf, sprintf and snprintf using the
format "fprintf(fp, buf)" are replaced with "fprintf(fp, "%s", buf)".
Also, the -Wformat-security warning option has been added to the
option list used when compiling with "make warn".
(stefan.bader@canonical.com, anderson@redhat.com)
- Fix a build failure when compiling with the very old gcc-3.4.6 version
on a 2.6.9-based RHEL4 IA64 host. The bfd library in gdb-7.6 is
compiled with the -Werror option, and it fails with the message
"elflink.c:4733: warning: 'idx' might be used uninitialized in this
function".
(anderson@redhat.com)
- Fix a build failure when compiling with the very old gcc-3.4.6 version
on a 2.6.9-based RHEL4 S390 or S390X host. The embedded gdb-7.6
fails to compile with the error message "s390-nat.c:364: error:
storage size of 'iov' isn't known".
(anderson@redhat.com)
- Fix to properly store two-digit kernel version numbers.
(timo.lindfors@iki.fi)
- Fix to provide hugepage address translation for the "vtop" command on
the PPC64 architecture.
(hbathini@linux.vnet.ibm.com)
- Fix for the "log" command to account for the kernel data structure
name change from "log" to "printk_log" in Linux 3.11-rc4 and later
kernels. Without the patch, the message "WARNING: log buf data
structure(s) have changed" will be displayed during initialization
and by the "log" command.
(anatol.pomozov@gmail.com)
- Fix to add a linefeed after the description of the "kmem -I" option
in the "help kmem" output, which was recently added in crash-7.0.0.
(anderson@redhat.com)
- Document the "-s" command line option in the "crash -h|--help" output
and in the crash.8 man page to also indicate that runtime command
scrolling is turned off by default.
(anderson@redhat.com)
- Fix for the "irq -d" option on 2.6.25 and later X86_64 kernels to
display the Intel interrupt descriptor table contents. Without the
patch, those kernel versions would display "irq: -d option not
supported or applicable on this architecture or kernel".
(anatol.pomozov@gmail.com)
- Fix for the "kmem -[sS]" options on Linux 3.11-rc1 and later kernels
that are configured with CONFIG_SLAB. Without the patch, the command
fails with the error message "kmem: invalid structure member offset:
kmem_cache_s_lists".
(anatol.pomozov@gmail.com)
- Fix for the "kmem <address>" and the "bt -F" options on Linux 3.8
and later kernels that are configured with CONFIG_SLUB. Without the
patch, the command would fail with the error message "kmem: invalid
structure member offset: page_slab".
(anderson@redhat.com)
- Fix misspellings in the "bt" and "search" help page output.
(anatol.pomozov@gmail.com)
- Fix for the determination of the base of the kernel's unity-mapped
virtual address region on recent ARM kernels whose "_stext" variable
address has changed from 0xc0008000 to 0xc0100000. Without the
patch, the crash session fails during invocation with the error
message "crash: vmlinux and vmcore do not match!".
(Jan.Karlsson@sonymobile.com)
- When printing data structures, prevent the embedded gdb from
symbolically translating pointers that are not kernel virtual
addresses. Kernel or module symbols that are not virtual addresses
can be mistaken for virtual addresses, leading to NULL pointers
being invalidly translated into a symbol name from the vmlinux or
module object file. For example, in X86_64 kernels, NULL pointers
are translated into the symbol "irq_stack_union", whose value is
not a virtual address, but rather a per-cpu offset value of 0.
(anderson@redhat.com)
- Fix for the "kmem -s <address>" or "kmem <address>" options on
Linux 3.11 and later kernels configured with CONFIG_SLAB. Without
the patch, both commands fail with the error message "kmem: cannot
resolve cache_cache".
(anderson@redhat.com)
- Fix to prevent the "bt" command from generating a segmentation
violation in a case where the per-cpu "current_task" variable and
the runqueue's "curr" variable did not agree, and the panic task
had overflowed its kernel stack. This led to the selection of
a starting RSP address which belonged to the other task; without
the patch, the command generated a segmentation violation after
printing the first frame of the backtrace.
(anderson@redhat.com)
(09/04/13)
7.0.1-1.fc20
- Fedora Rawhide build: crash-7.0.1-1.fc20 (06/17/13)
http://koji.fedoraproject.org/koji/buildinfo?buildID=427375
7.0.1 - Fix the -I include path sequence in the extensions/eppic.mk file to
prevent a series of "redefined" and "redeclaration" warnings when
compiling the EPPIC extension module.
(anderson@redhat.com)
- Address two compile-time warnings generated as a result of the
gdb-7.6.patch. Without the patch, there are "warning: no previous
prototype" warnings for gdb_main_entry() and replace_ui_file_FILE().
(anderson@redhat.com)
- Implemented a new "mod -t" option that walks through the installed
modules and checks for non-zero values in each module's "taints"
bitmask, and translates the bits into symbolic letters if possible,
or shows the hexadecimal value of the bitmask if not. In older
kernels, the "license_gplok" field is checked, and if non-zero, its
value is displayed in hexadecimal. Lastly, if the "gpgsig_ok" member
exists and is zero, a "(U)" notation will also be displayed.
(atomlin@redhat.com, anderson@redhat.com)
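The letter-or-hexadecimal behavior described for "mod -t" can be sketched
as follows. The bit-to-letter table is an assumption made for the example
(the real mapping depends on the kernel being analyzed), and the function
name is hypothetical.

```python
# Assumed bit-to-letter table, for illustration only.
TAINT_LETTERS = {0: "P", 1: "F", 10: "C", 12: "O"}


def format_taints(mask, letters=TAINT_LETTERS):
    """Translate a taints bitmask into letters, or hex if any bit is unknown."""
    if mask == 0:
        return ""                      # untainted module: nothing to show
    out = []
    bit = 0
    remaining = mask
    while remaining:
        if remaining & 1:
            if bit not in letters:
                return hex(mask)       # fall back to the raw bitmask
            out.append(letters[bit])
        remaining >>= 1
        bit += 1
    return "".join(out)


print(format_taints(0b11), format_taints(1 << 5))
```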
- Fixed compiler warnings generated by extensions/trace.c when compiled
with -DFORTIFY_SOURCE=2. Without the patch, the messages "warning:
ignoring return value of 'mktemp', declared with attribute
warn_unused_result", "warning: ignoring return value of 'fwrite',
declared with attribute warn_unused_result", and "warning:
'trace_dat' may be used uninitialized in this function" are
generated.
(anderson@redhat.com)
- Laid down the basic infrastructure for the ARM64 backtrace facility
using the kernel's arm64 unwind facility as a basis. Compile-tested
only.
(anderson@redhat.com)
- Implemented the ARM64 virtual-to-physical kernel and user address
translation functions, supporting both 2-level page tables with
64K pages, and 3-level page tables with 4K pages. Also added the
associated PTE translator function. Compile-tested only.
(anderson@redhat.com)
- Implemented the capability of building crash as an x86_64 binary
for analyzing ARM64 dumpfiles on an x86_64 host, which can be done
by entering "make target=ARM64". After the initial build is
complete, subsequent builds can be done by entering "make" alone.
(anderson@redhat.com)
- Added "aarch64" to the ExclusiveArch: line in the crash.spec file.
(anderson@redhat.com)
- Fix for the S390X "bt" command for Linux 3.10 and later kernels.
Without the patch, the starting stack location of the per-cpu async
and panic stacks of active tasks would be incorrectly determined.
(holzheu@linux.vnet.ibm.com)
(06/17/13)
7.0.0 - Updated the embedded gdb version to FSF gdb-7.6, which was officially
released by the Free Software Foundation on http://www.gnu.org on
4/26/13. The primary motivation for upgrading from gdb-7.3.1 is for
future ARM64 support, but there are also issues with respect to
kernels built with gcc-4.8.0. The relevant pieces of gdb-7.3.1.patch
were forward-ported to the gdb-7.6.patch, and the GDB_7_6 #define has
been applied in the top-level sources where appropriate.
(anderson@redhat.com)
- Continued incremental steps for support of the ARM64 architecture.
(anderson@redhat.com)
- Fix for the "struct name.member <address>" option if the "member"
name is also coincidentally a member of an embedded structure that is
located before the targeted member. Without the patch, the value of
the embedded structure's member is displayed instead of the targeted
member.
(qiaonuohan@cn.fujitsu.com)
- Expose a heretofore unadvertised "kmem -[sS] -I slab[,slab]" option
that specifies one or more slab cache names in a comma-separated
list that the "kmem -[sS]" option should ignore. This can be helpful
in cases where a corrupted slab cache may never complete, or in
very large memory systems where one or more caches take an inordinate
amount of time to complete.
(anderson@redhat.com)
- Fix for the "kmem -i" option on Linux 3.9 and later kernels. Without
the patch, the "TOTAL SWAP", "SWAP USED" and "SWAP FREE" lines are
not displayed because the kernel's former "swapper_space" singular
address_space structure has been changed into a "swapper_spaces"
array of address_space structures, with one for each swap partition.
(anderson@redhat.com)
- Support for the PPC64 BOOK3E processor family, whose virtual memory
layout and PTE format are significantly different. Without the
patch, the crash session fails to initialize properly.
(ataufer@us.ibm.com)
- Fix for the PPC64 "sys", "mach" and initial system banner display
of the processor speed in more recent kernels. Without the patch,
the "MACHINE" line in the initial banner and in the "sys" command
display may show "MACHINE: ppc64 (unknown Mhz)", and the "mach"
command may show "PROCESSOR SPEED: (unknown)".
(anderson@redhat.com, ataufer@us.ibm.com)
- Since the libgdb.a file no longer exists in gdb-7.6, the Makefile
does not check for it as a determining factor for whether a build
has succeeded.
(anderson@redhat.com)
- gdb-7.6 requires that the bfd library's "config.h" file be #include'd
before the "bfd.h" file by the top-level symbols.c file.
(anderson@redhat.com)
- gdb-7.6 has replaced/moved the gnu_debuglink_crc32() utility function
to bfd_calc_gnu_debuglink_crc32(); the call in symbols.c has been
configured based upon the gdb version.
(anderson@redhat.com)
- gdb-7.6 has reworked its do_cleanups() functionality, which requires
the gdb_error_hook() function to pass all_cleanups() as an argument.
(anderson@redhat.com)
- gdb-7.6 causes the anon_member_offset() function to fail due to a
change in the output string; the function has been changed to work
with both old and new gdb versions.
(anderson@redhat.com)
- gdb-7.6 required changes to vm_stat_init() and vm_event_state_init()
functions because enum lists get displayed differently on the S390X
and PPC64 architectures, which in turn caused failures of "kmem -i",
"kmem -z" and "kmem -V" on those two machine types.
(anderson@redhat.com)
- Adjusted the alignment of the "kmem -V" and "kmem -z" display of the
items in the vm_stat[] array based upon the longest enumerator name
string.
(anderson@redhat.com)
- Adjusted the alignment of the "kmem -V" display of the cumulative
totals of the per-cpu "vm_event_states" items based upon the longest
enumerator name string.
(anderson@redhat.com)
- Modified the top-level Makefile such that if the tar.gz file of the
configured gdb version does not exist in the build directory, try to
wget the file from http://ftp.gnu.org/gnu/gdb. This is normally not
necessary because the most recent gdb tar.gz file is bundled with
the crash utility tar.gz and src.rpm files. However, it will allow
the use of the gdb-less crash.tar.gz file created via "make tar" to
be copied to another location, or perhaps copied to a git tree, and
then built without containing the gdb tar.gz file.
(anderson@redhat.com)
- Fix for the s390x.c file to handle a gcc-4.8.0 compiler warning when
building crash with "make warn", or compiler failures when building
with "make Warn" on an S390X machine. Without the patch, gcc-4.8.0
generates the message "error: variable ‘psw_addr’ set but not used
[-Werror=unused-but-set-variable]".
(anderson@redhat.com)
- Fixes for the s390dbf.c file to handle gcc-4.8.0 compiler warnings when
building crash with "make warn", or compiler failures when building
with "make Warn" on an S390X machine. Without the patch, gcc-4.8.0
generates three "error: variable ‘<variable>’ set but not used
[-Werror=unused-but-set-variable]" messages.
(anderson@redhat.com)
- Fix for an X86_64 warning message that gets displayed during session
initialization when running against Linux 3.9 kernels that were
compiled with gcc-4.8.0. Without the patch, the warning message
"crash: cannot determine thread return address" is displayed prior
to the system information.
(anderson@redhat.com)
- Fix for lack of kernel text line number information by the "dis -l"
and "sym <text-symbol or address>" options on Linux 3.9 kernels
that were compiled with gcc-4.8.0. Without the patch, the line
number information for kernel text symbols of type "(T)" may not
be able to be determined and displayed.
(anderson@redhat.com)
(05/10/13)
6.1.6-1.fc20
- Fedora Rawhide build: crash-6.1.6-1.fc20 (04/09/13)
http://koji.fedoraproject.org/koji/buildinfo?buildID=410379
6.1.6 - Fix for a crash-6.1.5 regression that causes the "mount" command
to fail on kernel versions prior to Linux 3.3. Without the patch,
the command fails with the message "mount: invalid structure member
offset: mount_mnt_devname".
The regression was caused by this patch:
- Patch to the internal gdb_get_datatype() function to return the
typecode and length of integer variables.
(adrian.wenl@gmail.com, anderson@redhat.com)
which inadvertently caused the STRUCT_SIZE("mount") macro to return
a false positive for the non-existent "mount" data structure due to
the existence of a "mount" kernel symbol in "security/inode.c".
The patch above was requested as an aid for an extension module,
and had no use with respect to the base crash utility. However, the
patch has the unintended side effect of allowing macros such as
STRUCT_SIZE(), STRUCT_EXISTS(), MEMBER_OFFSET() and MEMBER_EXISTS()
to return false positives. All of the macros call datatype_info(),
a function that expects a data type argument. If the passed-in
data type argument does not exist, but there does happen to be a
kernel variable with the same name, then it used to be rejected
because the internal gdb_get_datatype() function purposefully did
not return the typecode and length of kernel variables. However,
the patch above modified that functionality, and gdb_get_datatype()
started to return the typecode and length of kernel variables.
In order to allow an extension module the capability of utilizing
the internal gdb_get_datatype() function with a kernel variable name
instead of a data type, the patch above has been modified to return
the typecode and length of data variables only if an additional, new,
GNU_VAR_LENGTH_TYPECODE flag is set in the gnu_request.flags field.
As it stands now, that flag is not used by the base crash utility.
Furthermore, as a defensive mechanism against future breakage, the
STRUCT_SIZE() and STRUCT_EXISTS() macros have been modified to pass
a special non-NULL, third argument to datatype_info() that will
enforce the fact that the request is only functional for data type
names. In addition, the MEMBER_OFFSET() and MEMBER_EXISTS() handling
in datatype_info() has been fortified to ensure that the base data
type is in fact a structure or union. It should also be noted that
extension modules that were compiled with the old STRUCT_SIZE() or
STRUCT_EXISTS() macro definitions will still work as they did before.
(anderson@redhat.com)
(04/04/13)
6.1.5 - Fix for the ARM "irq" command. Without the patch, on 2.6.34 and
later kernels configured with CONFIG_SPARSE_IRQ, the command fails
with the error message "irq: cannot determine number of IRQs".
(Jan.Karlsson@sonymobile.com)
- Fix for a segmentation violation generated during invocation while
parsing a makedumpfile-created "flat-format" vmcore-incomplete file.
Without the patch, the crash session would display the error message
"crash: unable to seek dump file vmcore-incomplete", followed by a
segmentation violation.
(anderson@redhat.com)
- Fix for a segmentation violation generated by the "kmem -s" option
when encountering a corrupted array_cache structure that contains
a bogus "avail" count that is greater than the maximum legitimate
limit value. Without the patch, the "kmem -s" command would print
a warning message regarding the invalid array_cache, complete the
command normally, and then generate a segmentation violation when
freeing buffers used by the command.
(anderson@redhat.com)
- Update to the "kmem -s" function to include the errors found in slab
structures in the display of total errors found when the command
completes. Without the patch, invalid list_head pointers, bad
inuse counters, and bad s_mem pointers were not added to the total
number of errors found.
(anderson@redhat.com)
- Fix for "crash --osrelease <dumpfile>" and "crash --log <dumpfile>"
when run on an ARM compressed kdump with a crash binary that was built
with "make target=ARM" on an x86 or x86_64 host. Without the patch,
if the compressed kdump header version is 4 or 5, "crash --osrelease"
fails with the error message "crash: compressed kdump: cannot lseek
dump vmcoreinfo" followed by "unknown", and "crash --log" fails with
the error message "crash: <dumpfile>: no VMCOREINFO section".
(anderson@redhat.com)
- Enhancement to the "swap" command to display the swap_info_struct
address of each configured swap device. The output has been changed
to display the address in the first column, and the variable-length
device name has been moved to the last column.
(anderson@redhat.com)
- Fix for the "kmem -[sS]" options on kernels that are configured with both
CONFIG_SLUB and CONFIG_NODES_SHIFT, and that are running on hardware
that generates NUMA nodes that contain no memory. Without the patch,
both command options fail immediately with the message "kmem: invalid
kernel virtual address: 8 type: kmem_cache_node nr_partial".
(anderson@redhat.com)
- Increment the PPC64 NR_CPUS maximum value from 1024 to 2048.
(anderson@redhat.com)
- Strip the ".isra." and ".part." appendages to cloned text symbol
names, which seem to have been introduced by gcc-4.6.0. To keep
them intact, a "--no_strip" command line option has been added.
(anderson@redhat.com)
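A minimal sketch of the symbol-name stripping described above. The entry
does not say exactly where crash trims the name, so this example simply
cuts at the first ".isra." or ".part." appendage; the function name, the
"no_strip" parameter and the sample symbol names are hypothetical.

```python
import re

# Matches a gcc clone appendage such as ".isra.3" or ".part.1" and
# everything that follows it.
_CLONE_SUFFIX = re.compile(r"\.(isra|part)\.\d+.*$")


def strip_clone_suffix(name, no_strip=False):
    """Return the symbol name without its .isra./.part. appendage."""
    if no_strip:
        return name                    # mirrors the idea of --no_strip
    return _CLONE_SUFFIX.sub("", name)


for symbol in ("do_work.isra.3", "do_work.part.1", "do_work"):
    print(symbol, "->", strip_clone_suffix(symbol))
```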
- Patch to the internal gdb_get_datatype() function to return the
typecode and length of integer variables.
(adrian.wenl@gmail.com, anderson@redhat.com)
- Fix for the "dev -d" option on Linux 3.6 and later kernels. Without
the patch the option fails with the message "dev: invalid structure
member offset: request_queue_rq".
(holzheu@linux.vnet.ibm.com)
- Export the red/black tree utility functions rb_first(), rb_parent(),
rb_right(), rb_left(), rb_next() and rb_last(). Without the patch,
they are statically declared and only used by the "runq" command.
(qiaonuohan@cn.fujitsu.com)
- Implemented a new "timer -r" option that displays the hrtimer queues,
supporting all versions from Linux 2.6.16 to the present.
(qiaonuohan@cn.fujitsu.com, anderson@redhat.com)
- Fix for "kmem -s" on Linux 3.8 and later kernels that are configured
with CONFIG_SLAB. The kmem_cache.array[] length has been extended to
store the nodelist pointers, so the original method to determine the
per-cpu array limit can go out-of-range. Without the patch, during
session initialization there may be a message that indicates "crash:
invalid kernel virtual address: <address> type: array cache limit",
followed by "crash: unable to initialize kmem slab cache subsystem";
if those messages do get shown, then "kmem -s" will subsequently fail
during runtime with the message "kmem: kmem cache slab subsystem not
available".
(qiaonuohan@cn.fujitsu.com)
- Two Xen hypervisor fixes:
(1) Fix console buffer content length calculation:
Function displaying console buffer always assumes its content
length equal to console buffer size. This is not true and
sometimes it sends garbage to the screen. This patch fixes
this issue.
(2) Improve calculation of beginning of virtual address space:
Xen changeset 26447 (x86: re-introduce map_domain_page() et
al) once again altered virtual address space. The current
algorithm calculating its start could not cope with that
change. New version establishes this value on the base of
image start address and is more generic.
(daniel.kiper@oracle.com)
- Fix for the ARM "vtop" command when run on a module address. Without
the patch, the command fails with the error message "vtop: ambiguous
address: <module-address> (requires -u or -k)".
(anderson@redhat.com)
- Add the "--active" command line option to the crash(8) man page
and to the "crash [-h|--help]" output.
(anderson@redhat.com)
- Add the "--buildinfo" command line option to the crash(8) man page
and to the "crash [-h|--help]" output.
(anderson@redhat.com)
- Remove the unadvertised and unnecessary "--data_debug" command line
option, given that it is the default setting.
(anderson@redhat.com)
- Remove the unadvertised and obsolete "--no_namelist_gzip" command
line option.
(anderson@redhat.com)
- Add the "-g [namelist]" command line option to the crash(8) man page
and to the "crash [-h|--help]" output.
(anderson@redhat.com)
- Remove the unadvertised and never-implemented "--shadow_page_tables"
command line option.
(anderson@redhat.com)
- Fix for the ARM "vtop" command when run on a user virtual address
of the panic task. Prior to Linux 3.3, the panic task's pgd gets
overwritten with a pgd that identity-maps the whole address space,
and therefore crash loses the capability of translating any user
virtual address into its original physical address.
(mika.westerberg@iki.fi)
- Fix to prevent the ARM linker mapping symbols "$d" and "$a" from
being added to the list of symbols from kernel modules. Without the
patch, the two symbols would only be rejected from the base kernel's
symbol list, but would be added to the symbol list of individual
kernel modules.
(mika.westerberg@iki.fi)
- Fix for the X86_64 "bt" command to recognize that the kernel was
built with CONFIG_FRAME_POINTER on Linux 3.7 and later kernels
that are configured with CONFIG_FUNCTION_TRACER. In those kernels,
the special 4-byte NOP instruction that can be overwritten during
runtime for dynamic ftracing has been moved to the very beginning
of each function, before the function preamble. Without the patch,
the test that checks the function preamble to determine whether
CONFIG_FRAME_POINTER was configured would fail, which could
potentially lead to less reliable backtraces.
(anderson@redhat.com)
(03/28/13)
6.1.4-1.fc19
- Fedora Rawhide build: crash-6.1.4-1.fc19 (02/19/13)
http://koji.fedoraproject.org/koji/buildinfo?buildID=396882
6.1.4 - Fix for a crash-6.1.3 regression with respect to the loading of
extension modules. Because of the change that replaced the obsolete
_init() and _fini() functions with constructor and destructor
functions, extension modules may fail to load when the extension
modules are built with older compiler/linkers. The problem is
due to the continued usage of the -nostartfiles compiler option
regardless of whether the extension module has replaced its _init()
function with a constructor function; with older compiler/linkers,
the module may fail to load. The fix predetermines whether an
extension module still uses _init() or if it has been updated to
use a constructor function, and will use the -nostartfiles option
only on older "legacy" modules.
(anderson@redhat.com)
- Implemented a new "list -r" option that can be used with lists
that are linked with list_head structures. When invoked, the
command will traverse the linked list in the reverse order by
using the "prev" pointer instead of "next".
(rabin@rab.in)
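The idea behind reverse traversal can be sketched with a toy model in
which each list_head is represented by its (next, prev) pointer pair. The
addresses and the function name are made up for the example; this is not
the crash implementation.

```python
def walk_list(nodes, start, reverse=False):
    """Walk a circular list; nodes maps address -> (next, prev)."""
    link = 1 if reverse else 0         # follow "prev" when reversed
    order, seen = [], set()
    addr = start
    while addr not in seen:            # stop once the list wraps around
        seen.add(addr)
        order.append(addr)
        addr = nodes[addr][link]
    return order


# Hypothetical three-entry circular list: 0x10 -> 0x20 -> 0x30 -> 0x10
nodes = {
    0x10: (0x20, 0x30),
    0x20: (0x30, 0x10),
    0x30: (0x10, 0x20),
}

print([hex(a) for a in walk_list(nodes, 0x10)])
print([hex(a) for a in walk_list(nodes, 0x10, reverse=True)])
```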
- Fix for the "swap" command's FILENAME display. In some kernels
between 2.6.32 and 2.6.38 the swap partition's pathname may not
show the "/dev" filename component.
(anderson@redhat.com)
- Fix for the "swap" command's PCT display, which will display a
negative percentage value if more than 5368709 swap pages are
in use.
(anderson@redhat.com)
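The 5368709-page threshold is consistent with a signed 32-bit overflow.
The sketch below assumes, without confirmation from the entry itself,
that the used-page count was converted to kilobytes (4 KB pages) and
multiplied by 100 in a 32-bit signed integer; the helper names are
hypothetical.

```python
def to_int32(value):
    """Wrap an arbitrary integer the way a signed 32-bit int would."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def scaled_used(pages):
    """Assumed intermediate value: pages * 4 (KB per page) * 100."""
    return to_int32(pages * 4 * 100)


for pages in (5368709, 5368710):
    print(pages, "->", scaled_used(pages))
```

Under that assumption the intermediate value stays positive at 5368709
pages and wraps negative at 5368710 pages, which matches the boundary
quoted above.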
(02/15/13)
6.1.3 - Implemented a new "crash --log dumpfile" option which dumps the
kernel log buffer and exits. A kernel namelist is not required,
but the dumpfile must contain the VMCOREINFO data from the ELF
header of the original /proc/vmcore file that was created by the
kexec/kdump facility. Accordingly, this option supports kdump ELF
vmcores and compressed kdump vmcores created by the makedumpfile
facility, including those that are in makedumpfile's intermediary
"vmcore.flat" format.
(anderson@redhat.com)
- Fixes for the ppc64.c file to handle gcc-4.7.2 compiler warnings when
building crash with "make warn", or compiler failures when building
with "make Warn" on a PPC64 machine. Without the patch, gcc-4.7.2
generates three "error: variable ‘<variable>’ set but not used
[-Werror=unused-but-set-variable]" messages.
(anderson@redhat.com)
- Update the PPC64 architecture's internal storage of the kernel's
MAX_PHYSMEM_BITS value for Linux 3.7 and later kernels, which changed
from 44 to 46 for 64TB support. Without the patch, there is no
known issue, but the stored value should be correct.
(anderson@redhat.com)
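The relationship between the two MAX_PHYSMEM_BITS values and the 64TB
figure is simple arithmetic, shown here with a hypothetical helper that
treats 1 TB as 2**40 bytes.

```python
def addressable_tb(max_physmem_bits):
    """Physical address space covered by the given bit width, in TB."""
    return (1 << max_physmem_bits) >> 40


for bits in (44, 46):
    print(bits, "bits ->", addressable_tb(bits), "TB")
```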
- Fix for the "mount" command's header display to indicate "MOUNT"
instead of "VFSMOUNT" on Linux 3.3 and later kernels because
the first column contains a mount structure address instead of a
vfsmount structure address. For those later kernels, it is
permissible to enter either the mount structure address, or the
address of the vfsmount structure that is embedded within it, as
an optional argument. The output has also been tightened up so
that the DIRNAME field is not shifted to the right based upon the
DEVNAME field length.
(anderson@redhat.com)
- Fix for the "mount <superblock>" search option on 2.6.32 and later
kernels. Without the patch, it is possible that multiple filesystems
will be displayed.
(anderson@redhat.com)
- Update to the "mount" help page to indicate that a dentry address
may be used as a search option.
(anderson@redhat.com)
- Fix for the "ps -l [pid|task|command]" option to display the
specified tasks sorted with the most recently-run task (the largest
last_run/timestamp) shown first, as is done with the "ps -l" option
with no arguments. Without the patch, the timestamp data gets
displayed in the order of the "[pid|task|command]" arguments.
(anderson@redhat.com)
- Added the "ps" command to the set of supported "foreach" commands,
serving as an alternative manner of passing task-identifying
arguments to the "ps" command. For example, a command such as
"foreach RU ps" can be accomplished without having to pipe normal
"ps" output to "grep RU". All "ps" options are supported from the
"foreach" framework.
(anderson@redhat.com)
- Fix for the "ps -G" restrictor option such that it also takes effect
if the -p, -c, -l, -a, -r or -g options are used. Without the
patch, thread group filtering would only take effect when the default
"ps" command is used without any of the options above.
(anderson@redhat.com)
- Fortify the internal hq_open() function to return FALSE if it is
already open, and have restore_sanity() and restore_ifile_sanity()
call hq_close() unconditionally.
(anderson@redhat.com)
- Added the "extend" command to the set of built-in commands that
support minimal mode. A new MINIMAL flag has been created for
extension modules to set in their command_table_entry.flags field(s)
to signal that a command supports minimal mode. If the crash session
has been invoked with --minimal, then the "extend" command will
require that the module registers at least one command that has
the MINIMAL bit set.
(per.fransson.ml@gmail.com)
- Prevent the "__crc_*" symbols from being added to the ARM kernel
symbol list.
(per.fransson.ml@gmail.com, rabin@rab.in)
- Prevent the "PRRR" and "NMRR" absolute symbols from being added to
the ARM kernel symbol list. Without the patch, it allows an invalid
set of addresses to pass the check in the in_ksymbol_range() function.
(per.fransson.ml@gmail.com)
- Fix for the ppc.c file to handle a gcc-4.7.2 compiler warning when
building crash with "make warn", or compiler failures when building
with "make Warn" on a PPC machine. Without the patch, gcc-4.7.2
generates the message "error: variable ‘dm’ set but not used
[-Werror=unused-but-set-variable]".
(anderson@redhat.com)
- Workaround for the "crash --osrelease dumpfile" option to be able
to work with malformed ARM compressed kdump headers. ARM compressed
kdumps that indicate header version 3 may contain a malformed
kdump_sub_header structure with offset_vmcoreinfo and size_vmcoreinfo
fields offset by 4 bytes, and the actual vmcoreinfo data is not
preceded by its ELF note header and its "VMCOREINFO" string. This
workaround finds the vmcoreinfo data and patches the stored header's
offset_vmcoreinfo and size_vmcoreinfo values. Without the patch, the
"--osrelease dumpfile" command line option fails with the message
"crash: compressed kdump: cannot lseek dump vmcoreinfo", followed by
"unknown".
(anderson@redhat.com)
- Fix for the "help -n" option on 32-bit compressed kdumps. Without
the patch, the offset_vmcoreinfo, offset_eraseinfo, and offset_note
fields of the kdump_sub_header have their upper 32-bits clipped off
when displayed. However, it should be harmless since the offset
values point into the first few pages of the dumpfile.
(anderson@redhat.com)
- Update of the extensions/echo.c extension module example, and the
"extend" help page, to utilize a constructor function to call the
register_extension() function. The _init() and _fini() functions
have been designated as obsolete for usage by dlopen() and dlclose().
The echo.c example module has been modified to contain echo_init()
and echo_fini() functions marked as __attribute__((constructor)) and
__attribute__((destructor)) respectively.
(anderson@redhat.com)
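A minimal sketch of the constructor/destructor pattern follows. The registration helpers are hypothetical stand-ins so that the block is self-contained; a real module would call register_extension() with its command table instead. The attributes are GCC/Clang extensions.

```c
/* Hypothetical stand-ins for the real registration calls. */
static int registered;
static void register_extension_sketch(void) { registered = 1; }
static void unregister_sketch(void) { registered = 0; }

/* Runs when the shared object is loaded by dlopen() (or, in a plain
 * executable, before main()). Takes the place of the obsolete _init(). */
static void __attribute__((constructor)) echo_init_sketch(void)
{
    register_extension_sketch();
}

/* Runs at dlclose() or at normal exit. Takes the place of _fini(). */
static void __attribute__((destructor)) echo_fini_sketch(void)
{
    unregister_sketch();
}
```

Because the functions carry distinct names, several modules can be loaded without colliding on the single _init/_fini symbol pair.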
- Updated extensions/dminfo.c, extensions/snap.c and extensions/trace.c
to replace their _init() and _fini() functions with constructor and
destructor functions.
(anderson@redhat.com)
- Fix for the "bt" command on the PPC64 architecture when running
on Linux 3.7 kernel threads. Without the patch, some kernel threads
may fail to terminate on the final ".ret_from_kernel_thread" frame,
repeating that frame endlessly, because the stack linkage pointer
points back to itself instead of being NULL.
(anderson@redhat.com)
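The termination rule above can be illustrated with a small frame walker. The frame record is hypothetical and keeps only the back-chain pointer; this is not the PPC64 unwinder itself, just a sketch of stopping on either a NULL or a self-referencing linkage pointer.

```c
#include <stddef.h>

/* Hypothetical frame record: only the stack linkage pointer is modeled. */
struct frame_sketch {
    struct frame_sketch *back_chain;
};

/* Count frames, stopping at a NULL back-chain or at a frame whose
 * back-chain points to itself. max_frames is a safety bound. */
static int count_frames(const struct frame_sketch *fp, int max_frames)
{
    int n = 0;

    while (fp != NULL && n < max_frames) {
        n++;
        if (fp->back_chain == fp)   /* self-referencing final frame */
            break;
        fp = fp->back_chain;
    }
    return n;
}
```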
(02/11/13)
6.1.2-1.fc19
- Fedora Rawhide build: crash-6.1.2-1.fc19 (01/08/13)
http://koji.fedoraproject.org/koji/buildinfo?buildID=377171
6.1.2 - Enhancement of the "task" command to display both the task_struct
and the thread_info structures of a task. The -R option accepts
members of either/both structure types.
(anderson@redhat.com)
- Fix for the X86_64 "search" and "rd" commands due to this commit:
http://git.kernel.org/linus/027ef6c87853b0a9df53175063028edb4950d476
Upon any attempt to read a page within the RAM region reserved for
AMD GART on a live system, the Linux 3.7rc1 commit above causes
the /dev/mem, /proc/kcore and /dev/crash drivers to spin
forever, leading to a kernel soft lockup. The RAM pages reserved for
GART consist of 2MB large pages whose _PAGE_PRESENT bits are turned
off. Prior to the above commit, a read() attempt on GART RAM would
cause an unresolvable page fault, and would harmlessly return an
EFAULT. The commit above has changed the pmd_large() function such
that it now returns TRUE if only the _PAGE_PSE bit is set in the PTE,
whereas before it required both _PAGE_PSE and _PAGE_PRESENT. So instead of
just failing the read() system call with an EFAULT, the page fault
handling code now considers it a spurious TLB fault, and the
instruction is retried indefinitely. The crash utility patch stores
the GART physical memory range, and disallows any attempts to read
from it.
(anderson@redhat.com)
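The two halves of the entry above can be sketched as follows: the large-page test before and after the kernel commit, and a guard that refuses reads overlapping a recorded GART range. The bit values are the x86 _PAGE_PRESENT and _PAGE_PSE positions; the structure and function names are hypothetical and not taken from the crash sources.

```c
#include <stdint.h>

#define PAGE_PRESENT_BIT 0x001ULL  /* x86 _PAGE_PRESENT */
#define PAGE_PSE_BIT     0x080ULL  /* x86 _PAGE_PSE (large page) */

/* Large-page test as described before the kernel commit: both bits. */
static int pmd_large_before(uint64_t pmd)
{
    return (pmd & (PAGE_PSE_BIT | PAGE_PRESENT_BIT)) ==
           (PAGE_PSE_BIT | PAGE_PRESENT_BIT);
}

/* Large-page test as described after the commit: PSE alone suffices. */
static int pmd_large_after(uint64_t pmd)
{
    return (pmd & PAGE_PSE_BIT) != 0;
}

/* Hypothetical record of the GART physical range, [start, end). */
struct gart_range_sketch {
    uint64_t start;
    uint64_t end;
};

/* Return 1 if a read of len bytes at paddr must be refused. */
static int read_overlaps_gart(const struct gart_range_sketch *g,
                              uint64_t paddr, uint64_t len)
{
    if (g->start == g->end)          /* no GART range recorded */
        return 0;
    return paddr < g->end && paddr + len > g->start;
}
```

A non-present entry with only the PSE bit set (0x80) is rejected by the first test and accepted by the second, which is the behavior change the guard works around.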
- If an EPPIC_GIT_URL environment variable is defined, then the URL
that it points to is used as an alternative to the code.google.com
git source repository for the eppic.so extension module. However,
the alternative site is only accessed if code.google.com can first
be pinged; this patch removes that restriction.
(per.fransson.ml@gmail.com)
- Fix for the "files" command PATH display on kernels configured with
CONFIG_DEVTMPFS, when the vfsmount pointer in a file structure's
"f_path" member does not point to the root vfsmount required for
reconstructing the full file pathname. Without the patch, open files
in the /dev directory may be truncated and not show the "/dev" filename
component.
(anderson@redhat.com)
- Enhancement to the "kmem -v" option on 2.6.28 and later kernels that
utilize the "vmap_area_list" list of mapped kernel virtual memory
regions, replacing the usage of the to-be-obsoleted "vmlist" list.
In those kernels, the output of the command will also show each
vmap_area structure address, in addition to its vm_struct address,
memory range, and size.
(anderson@redhat.com)
- Update to the exported do_rbtree() and do_rdtree() functions such
that they will return the number of items found in the targeted tree,
similar in nature to the do_list() function. The two functions have
also been fixed such that the VERBOSE flag is actually recognized,
so that external callers are able to gather the entries in a tree
without having them displayed. The calls to either function may be
enclosed with hq_open() and hq_close() so that the tree entries may
be subsequently gathered by retrieve_list() into a supplied buffer,
as well as to recognize a corrupted list with duplicate entries.
(anderson@redhat.com)
- Fix for the "extend -u" option to prevent the usage of a member of
a free()'d extension_table structure. No command failure occurs,
but rather an inadvertent coding error.
(Jan.Karlsson@sonymobile.com)
- Fix to allow error() to be called during an open_tmpfile() sequence
prior to close_tmpfile() being called. There are no crash functions
that call error() during an open_tmpfile() sequence, but there's no
reason why it cannot be done. Without the patch, the error message
gets displayed on stdout (as expected), but the error message will
also overwrite/corrupt the tmpfile() data while it is being parsed.
(anderson@redhat.com)
- Fix to properly determine whether X86_64 kernels were configured
with CONFIG_FRAME_POINTER, due to this ftrace-related commit:
http://git.kernel.org/linus/d57c5d51a30152f3175d2344cb6395f08bf8ee0c
Without the patch, the crash utility fails to determine whether the
kernel was built with CONFIG_FRAME_POINTER, and therefore the "bt"
command cannot take advantage of it for more reliable backtraces.
(anderson@redhat.com)
- Fix to properly determine whether 2.6.31 and earlier X86_64 kernels
were configured with CONFIG_FRAME_POINTER. Without the patch, the
crash utility may fail to determine whether the kernel was built with
CONFIG_FRAME_POINTER. In those kernel versions -- which may be
dependent upon the compiler version used -- one of the sample
functions tested may have its "push %rbp, mov %rsp,%rbp" function
preamble separated by other instruction(s), resulting in a false
negative that precludes the "bt" command from taking advantage of
framepointers.
(anderson@redhat.com)
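One way to tolerate intervening instructions is to look for the two preamble instructions in order rather than on adjacent lines. The sketch below works on already-disassembled text lines and is an assumption about the general approach, not the check that crash actually performs.

```c
#include <string.h>

/* Return 1 if "push %rbp" is followed, possibly after other instructions,
 * by "mov %rsp,%rbp" within the first `count` disassembled lines. */
static int has_frame_pointer_preamble(const char *const lines[], int count)
{
    int i, pushed = 0;

    for (i = 0; i < count; i++) {
        if (!pushed) {
            if (strstr(lines[i], "push") && strstr(lines[i], "%rbp"))
                pushed = 1;
        } else if (strstr(lines[i], "mov") && strstr(lines[i], "%rsp,%rbp")) {
            return 1;
        }
    }
    return 0;
}
```

In practice the scan would be limited to the first few instructions of each sample function, since a match far into the body says little about the preamble.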
- Fix for the file and line-number string that is displayed by the
"sym <kernel-text>" option. Without the patch, the "/usr/src/"
part of the string is stripped, and the filename string itself
could have two corrupted characters in the pathname, for example,
showing "k3.nel-3.6.fc17" instead of "kernel-3.6.fc17". This is
dependent upon the compiler version, or perhaps the string library
that is linked into the crash binary, because it only has been seen
on crash binaries built with gcc-4.7. The fix now displays the full
pathname, no longer dropping the "/usr/src" from the beginning.
(anderson@redhat.com)
- Restricted the X86_64 "line_number_hook" to kernels earlier than
2.6.24, i.e., kernels prior to the x86/x86_64 merge. Without the
patch, the manufactured filename information for assembly-language
files was incorrect for 2.6.24 and later kernels. Also, the kernel
debuginfo data now has file/line-number data for assembly-language
files as well, obviating the need for the hook.
(anderson@redhat.com)
- Fix for the extensions/trace.c extension module to prevent a double
free exception that would occur if a calloc() call fails during
module initialization.
(per.fransson.ml@gmail.com)
- Fix for the "p -u" option if a 32-bit kernel symbol is incorrectly
passed as an argument. Without the patch, the command fails, but
the next command requiring the services of the embedded gdb module
will generate an error message of the sort "*** glibc detected ***
crash: free(): invalid pointer: <address> ***", or "*** glibc
detected *** crash: munmap_chunk(): invalid pointer: <address> ***",
followed by a backtrace, and an abort of the crash session.
(anderson@redhat.com)
- Fix for the embedded gdb module to correctly handle kernel modules
whose ELF header contains "__ksymtab" and "__ksymtab_gpl" sections
with non-zero (nonsensical) "Address" values, such as those shown
in this example snippet:
$ readelf -a edac_core.so
...
Section Headers:
[Nr] Name              Type             Address           Offset
     Size              EntSize          Flags  Link  Info  Align
...
[ 8] __ksymtab         PROGBITS         0000000000000060  0000ad90
     0000000000000010  0000000000000000   A       0     0     16
...
[10] __ksymtab_gpl     PROGBITS         0000000000000070  0000add0
     00000000000001a0  0000000000000000   A       0     0     16
...
Without the patch, if one of the odd sections above is encountered,
the "Offset" values of the remaining sections are not processed; and
if the module's .data section is ignored, gdb incorrectly calculates
the address of all symbols in the module's .data section, leading to
incorrect output if, for example, data is printed with the gdb "p"
command. This invalid ELF section format was introduced in Linux 3.0
by the kernel's "scripts/module-common.lds" file.
(jan.kratochvil@redhat.com)
- Fix for the "runq -g" option if the kernel contains more than 200
task groups. Without the patch, the command generates a segmentation
violation.
(anderson@redhat.com)
(01/09/13)
6.1.1-1.fc19
- Fedora Rawhide build: crash-6.1.1-1.fc19 (11/27/12)
http://koji.fedoraproject.org/koji/buildinfo?buildID=369327
6.1.1 - Fixes for the ARM "vtop" command display of kernel unity-mapped
virtual addresses. Without the patch, the PGD, PMD values may be
incorrect, and the PAGE value is always incorrectly calculated.
(paawan1982@yahoo.com, rabin@rab.in)
- Fix for Linux 2.6.34 and later kernels that are configured with
CONFIG_SLUB, but not configured with CONFIG_IKCONFIG, to be able
to determine the kernel's CONFIG_NR_CPUS value. Without the patch,
if the actual number of cpus is larger than the crash utility's
per-architecture NR_CPUS maximum value, then the cpus beyond the
NR_CPUS limit would not be accounted for.
(anderson@redhat.com)
- Increment the X86_64 NR_CPUS maximum value from 4096 to 5120.
(anderson@redhat.com)
- Try to determine whether the kernel is running as a virtual machine
by using any available kernel-specific data or by dumpfile type.
The results of the hypervisor type search will be stored in the
internal kernel_table data structure, and if a hypervisor type can
be determined, its name will be displayed by the "mach" command. The
result of the hypervisor determination, successful or otherwise, may
be viewed during session initialization if the -d<number> command
line option is invoked, or during runtime via the "help -k" option.
Only applicable to the X86, X86_64 and IA64 architectures.
(anderson@redhat.com)
- Allow the "ps command" and "foreach name" command options to contain
more than the kernel's maximum of 15 characters that are stored in
each task's task_struct.comm[] array. Without the patch, the two
string arguments were required to be the possibly-truncated command
name string in order to match.
(anderson@redhat.com)
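The matching rule above amounts to comparing only as many characters as the kernel could have stored. A minimal sketch, with a hypothetical function name and the 15-character limit taken from the entry:

```c
#include <string.h>

#define COMM_CHARS 15   /* characters kept in task_struct.comm[] */

/* Return 1 if the user-supplied name matches the stored command name,
 * comparing only the characters the kernel could have stored. */
static int comm_matches(const char *user_name, const char *stored_comm)
{
    return strncmp(user_name, stored_comm, COMM_CHARS) == 0;
}
```

With this rule, the full name "gnome-settings-daemon" matches a task whose stored name is the truncated "gnome-settings-".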
- Enhancement to the "ps" command to allow any of the "command"
arguments to be POSIX extended regular expressions. The expression
string must be encompassed by "'" characters, and will be matched
against the names of all tasks.
(anderson@redhat.com)
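Matching a task name against a POSIX extended regular expression can be sketched with the standard regcomp() and regexec() calls. The function name is hypothetical, and the sketch assumes the surrounding "'" characters have already been stripped from the argument.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if name matches the POSIX extended regular expression,
 * 0 if it does not, and -1 if the expression fails to compile. */
static int task_name_matches(const char *pattern, const char *name)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, name, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

A caller would apply this to every task name in turn and display the tasks for which it returns 1.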
- Add support for 2GB pages in the S390X virtual-to-physical address
translation function. Required for the new IBM zEC12 Mainframe.
(holzheu@linux.vnet.ibm.com)
- Initial preparation for support of the ARM64 architecture.
(anderson@redhat.com)
- Fix for the "log" command if a kernel message contains either a
'\n' or a '\t'. Without the patch, the two characters are replaced
with a '.', and the message continues. With the patch applied,
the characters are printed, and if it is a '\n', spaces are inserted
after the linefeed so that the subsequent characters in the message
line up appropriately under the preceding line.
(holzheu@linux.vnet.ibm.com, anderson@redhat.com)
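The patched behavior can be sketched as a copy loop that keeps '\n' and '\t' and pads after each linefeed. The function name and the indent parameter are hypothetical; the indent stands for the width of whatever prefix precedes the message text on its first line.

```c
#include <stddef.h>

/* Copy msg into out, keeping '\n' and '\t' and following each '\n' with
 * `indent` spaces. Returns the bytes written, excluding the NUL. */
static size_t format_log_text(const char *msg, int indent,
                              char *out, size_t outsz)
{
    size_t n = 0;
    int i;

    if (outsz == 0)
        return 0;
    for (; *msg != '\0' && n + 1 < outsz; msg++) {
        out[n++] = *msg;
        if (*msg == '\n')
            for (i = 0; i < indent && n + 1 < outsz; i++)
                out[n++] = ' ';
    }
    out[n] = '\0';
    return n;
}
```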
- Fix for the "kmem -[sS]" options on kernels that are configured with
both CONFIG_SLUB and CONFIG_NODES_SHIFT, and that are running on hardware
that generates NUMA node ids that are not numbered consecutively.
Without the patch, both command options fail with the error message
"kmem: invalid kernel virtual address: 8 type: kmem_cache_node
nr_partial".
(anderson@redhat.com)
- Fix for the "trace.so" extension module's "trace show" command.
Without the patch, the output showing each trace point is shown
with two hexadecimal virtual addresses instead of displaying them
symbolically using the format "<function> <-- <function>".
(qiaonuohan@cn.fujitsu.com)
- Fixes for handling incomplete/invalid ELF or compressed kdump
vmcores whose per-cpu NT_PRSTATUS notes are missing. For example,
this has been seen to happen when kexec/kdump incorrectly recognizes
a Xen DomU kernel as a Xen Dom0 kernel. Without the patch, possible
ramifications would be a NULL pointer dereference during session
initialization when searching for the panic task, or during the "bt"
command on an active task.
(d.hatayama@jp.fujitsu.com)
- Implemented a new "runq -g" option that displays CFS runqueue tasks
hierarchically by task_group. Tasks in throttled groups are also
displayed. The "runq" command with no option will no longer display
task_group data for the RT queue.
(zhangyanfei@cn.fujitsu.com)
- Patchset for Xen support up to version 4.2:
(1) Fix page tables caching issues,
(2) Use init_tss array or per_cpu__init_tss,
(3) Use per_cpu__crash_notes or crash_notes array,
(4) Try hard to get max_cpus value,
(5) Use cpu_present_map instead of cpu_online_map.
(daniel.kiper@oracle.com)
- Fix for the S390X virtual-to-physical address translation to allow
the HW Change-bit override bit (0x100) to be used in page table
entries.
(holzheu@linux.vnet.ibm.com)
- Fix for a rarely-seen circumstance in which a kdump ELF vmcore of
a Xen dom0 kernel gets incorrectly identified as an old-style netdump
ELF vmcore. This has only been seen after the original kdump ELF
vmcore was transformed via "makedumpfile -d1". Without the patch,
the crash session fails during initialization with the messages
"crash: invalid size request: 0 type: xen kdump p2m mfn page",
followed by "crash: cannot read xen kdump p2m mfn page". If run
against the Xen hypervisor, the session fails during initialization
with the error message "crash: read error: kernel virtual address:
<address> type: crashing_cpu".
(anderson@redhat.com)
(11/20/12)
6.1.0-1.fc17
- Fedora 17 build: crash-6.1.0-1.fc17 (10/25/12)
http://koji.fedoraproject.org/koji/buildinfo?buildID=362456
6.1.0-1.fc19
- Fedora Rawhide build: crash-6.1.0-1.fc19 (10/1/12)
http://koji.fedoraproject.org/koji/buildinfo?buildID=357655
6.1.0 - Fix for 32-bit SADUMP dumpfiles to correctly check whether a
requested physical address is within the 0-640K backup region.
Without the patch, requested physical addresses that are larger
than 32-bits are truncated to 32-bit values, leading to unexpected
results.
(d.hatayama@jp.fujitsu.com)
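The truncation problem can be shown with two versions of the range check. Both function names are hypothetical; the only figure taken from the entry is the 0-640K extent of the backup region.

```c
#include <stdint.h>

#define BACKUP_REGION_END 0xA0000ULL   /* 640K */

/* Buggy form: the address is narrowed to 32 bits before the check. */
static int in_backup_region_truncated(uint64_t paddr)
{
    uint32_t narrowed = (uint32_t)paddr;

    return narrowed < BACKUP_REGION_END;
}

/* Corrected form: compare the full-width physical address. */
static int in_backup_region(uint64_t paddr)
{
    return paddr < BACKUP_REGION_END;
}
```

An address such as 0x100001000 lies above 4GB, yet the narrowed check sees 0x1000 and wrongly treats it as part of the backup region.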
- Added support for the ELF dumpfile type that is generated by the new
"virsh dump --memory-only" option. The "--memory-only" option uses
a new "dump-guest-memory" QEMU monitor command that creates an ELF
kdump vmcore clone. The "virsh dump" command continues to borrow
the "migrate" QEMU monitor command to create a file that is designed
for guest migration, and not well-suited for a vmcore because it is
not designed for random-access of physical memory. A new "help -r"
option has been added to dump the registers that are stored in
per-cpu "QEMU" ELF notes; those notes are used to distinguish this
dumpfile type from regular kdump ELF vmcores. The patch also
combines common functionality between the new format and the SADUMP
format.
(qiaonuohan@cn.fujitsu.com, anderson@redhat.com)
- Fix for the "runq" command for kernels that have the CFS scheduler.
Without the patch, a cpu's RT runqueue may incorrectly display
"[no tasks queued]" when in fact there are tasks on its queue.
(anderson@redhat.com)
- In the highly-unlikely event that a pre-Linux 3.5 kernel's log buffer
cannot be read during initialization, display a message indicating
"WARNING: cannot read log_buf contents", and just continue. Without
the patch, a "readmem" error would be displayed and the crash session
would be killed.
(anderson@redhat.com)
- Updated the "net -a" option to support Linux 2.6.9 to 3.6.0. Without
the patch, the option displayed "net: -a option not supported or
applicable on this architecture or kernel".
(makc@redhat.com)
- Enhanced the "net -a" option to show the struct neighbour address
associated with each line of output.
(makc@redhat.com)
- Fix for the "runq" command for kernels that are configured with
CONFIG_RT_GROUP_SCHED. Without the patch, tasks contained within
an RT group scheduling entity are not displayed.
(zhangyanfei@cn.fujitsu.com)
- Fix for "crash --version" or "crash -v" to prevent the sourcing
of a .gdbinit file that is located in the current directory.
(anderson@redhat.com)
- Preemptive fix to handle this patch to the x86 devmem_is_allowed()
function that was posted on the Linux Kernel Mailing List here:
https://lkml.org/lkml/2012/8/28/357
If the proposed kernel patch is put into place, a failed attempt to
use /dev/mem when the kernel is configured with CONFIG_STRICT_DEVMEM
will not result in an automatic attempt to use /proc/kcore. With
this crash utility patch, the automatic switch to /proc/kcore will
be attempted regardless of whether the kernel patch is accepted or not.
(anderson@redhat.com)
- Patch for CVE-2012-3509: libiberty: objalloc_alloc integer overflows
(fw@deneb.enyo.de)
- Fix for Linux 3.0 and later kernels that have been configured with
CONFIG_SLAB and without CONFIG_NODES_SHIFT. Without the patch, the
warning messages "crash: nr_node_ids: symbol does not exist" and
"crash: unable to initialize kmem slab cache subsystem" are displayed
during initialization, and the "kmem -[sS]" options fail with the
message "kmem: kmem cache slab subsystem not available".
(per.fransson.ml@gmail.com, anderson@redhat.com)
- Allow the build procedure to use an alternate compiler by passing
"make CC=<compiler>" to the top-level Makefile.
(anderson@redhat.com)
- Allow the user to append options to the "configure" script that is
invoked by the initial embedded gdb build procedure. The additional
options should be put in a file named "GDBFLAGS.extra" located in the
top-level directory.
(anderson@redhat.com)
- Change for the "ps" command if a task is stopped due to the task
being traced by another task. Without the patch, the traced task
is shown with the "ST" (stopped) status; with the patch it will be
shown with a "TR" (traced) status.
(anderson@redhat.com, qiaonuohan@cn.fujitsu.com)
- The "TR" state has been added to the "foreach" command's list of task
state qualifiers. Without the patch, there is no way to filter out
tasks that are stopped due to being traced by another task.
(anderson@redhat.com)
- Fix for passing a "gdb" command to a crash session via a pipe
if there are any spaces preceding the "gdb" command name in the
string. Without the patch, the command will fail with the error
message "gdb: gdb request failed: <truncated input-string>".
(qiaonuohan@cn.fujitsu.com)
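Recognizing the command name regardless of leading whitespace can be sketched as below. The function name is hypothetical and the code is not the parser crash uses; it only illustrates skipping the spaces before comparing.

```c
#include <ctype.h>
#include <string.h>

/* Return a pointer to the text following the "gdb" command name, or NULL
 * if the line is not a "gdb" command. Leading whitespace is skipped. */
static const char *gdb_request(const char *line)
{
    while (isspace((unsigned char)*line))
        line++;
    if (strncmp(line, "gdb", 3) != 0)
        return NULL;
    line += 3;
    if (*line != '\0' && !isspace((unsigned char)*line))
        return NULL;            /* e.g. "gdbserver" is a different word */
    while (isspace((unsigned char)*line))
        line++;
    return line;
}
```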
- Preparation for the future S390/S390X structure name change from
"_lowcore" to "lowcore". The patch checks which structure is defined
and uses the correct name.
(holzheu@linux.vnet.ibm.com)
- Replaced datatype_info() calls in do_radix_tree() and do_rdtree()
with the preferred MEMBER_SIZE() macro.
(anderson@redhat.com)
(09/28/12)
6.0.9-1.fc19
- Fedora Rawhide build: crash-6.0.9-1.fc19 (08/21/12)
http://koji.fedoraproject.org/koji/buildinfo?buildID=349682
6.0.9 - Fix for building on host machines that have glibc-2.15.90 installed,
in which case the glibc header file /usr/include/bits/siginfo.h no
longer declares a "struct siginfo", but only the "siginfo_t" typedef.
Without the patch, the build of the embedded gdb module fails with
the error message "linux-nat.h:63:18: error: field 'siginfo' has
incomplete type".
(anderson@redhat.com)
- Add support for reading compressed kdump dumpfiles that were
compressed by the snappy compressor. This feature is disabled by
default. To enable this feature, build the crash utility in the
following manner:
(1) Install the snappy libraries by using the host system's package
manager or by directly downloading the libraries from the author's
website. The packages required are:
- snappy
- snappy-devel
The author's website is: http://code.google.com/p/snappy
(2) Create a CFLAGS.extra file and an LDFLAGS.extra file in the
top-level crash sources directory:
- enter -DSNAPPY in the CFLAGS.extra file
- enter -lsnappy in the LDFLAGS.extra file
(3) Build crash with "make" as always.
(d.hatayama@jp.fujitsu.com)
- Prevent the "ptov" command from returning an invalid virtual address
on 32-bit architectures. Without the patch, the command may result
in an invalid virtual address if the physical address entered cannot
be accessed by a unity-mapped kernel virtual address. The patch
verifies that the calculated virtual address can be translated back
into the supplied physical address.
(Jan.Karlsson@sonymobile.com, anderson@redhat.com)
- Fix to automatically try /proc/kcore as an alternative live memory
source when the /dev/crash driver does not exist and /dev/mem is
unusable because the kernel was configured with CONFIG_STRICT_DEVMEM.
Without the patch, the automatic switch from /dev/mem to /proc/kcore
is only attempted on the X86 and X86_64 architectures.
(anderson@redhat.com)
- Added missing linefeeds to several error messages in makedumpfile.c.
(anderson@redhat.com)
- Fix for a regression introduced by a crash-5.1.1 patch that reworked
the handling of "set" commands that are put in .crashrc files, such
that only certain command options would get resolved before the crash
session is initialized. Without this patch, the "--less", "--more",
"--no_scroll" and "--CRASHPAGER" crash command line options do not
properly override conflicting "set scroll <option>" entries that
are put in a .crashrc file.
(anderson@redhat.com)
- Added new "--hex" and "--dec" crash command line options, which will
set the command output format to hexadecimal or decimal. These two
command line options will override any "set radix [10|16]" settings
in a .crashrc file; since decimal is the default, the "--dec" option
would only be necessary to override a "set radix 16" setting in a
.crashrc file.
(anderson@redhat.com)
- Fix for the "runq" and "timer" commands when running against 2.6.34
and later kernels that are not configured with CONFIG_SMP. Without
the patch, the "runq" command fails with the error message "runq:
per-cpu runqueues does not exist", and the "timer" command fails
with the error message "timer: zero-size memory allocation! (called
from <address>)".
(anderson@redhat.com)
- If code.google.com is not available from the host build machine, then
"make extensions" will be delayed by a 10 minute timeout of the
"git clone" command that downloads the EPPIC library and extension
module source tree. The patch pings code.google.com first in order
to determine its availability before attempting the download.
(anderson@redhat.com)
- For kernel versions 3.5 and later, in which the kernel log buffer has
been converted from a byte-buffer to a variable-length record buffer,
the "log -m" option will display the level in hexadecimal, and
depending upon the kernel version, the value also contains either the
facility or flags bits.
(anderson@redhat.com)
- Fix for accessing the per-cpu registers from ARM vmcores generated
by recent kernels in which the per-cpu data region has been moved
into mapped kernel virtual address space. Without the patch, an
incorrect physical address is calculated, resulting in bogus register
contents.
(Jan.Karlsson@sonymobile.com)
- Check that an s390x dumpfile is a "live dump" earlier during session
initialization so that the internal LIVE_DUMP flag will get set when
"crash --minimal" is invoked.
(holzheu@linux.vnet.ibm.com)
- Removed the usage of C++ keywords in structure and structure member
names declared in "defs.h" so that extension modules written in C++
will compile successfully. Accordingly, the "struct namespace" is
renamed to "struct symbol_namespace", the struct symbol_table_data's
"namespace" member is renamed to "kernel_namespace", and the struct
gnu_request's "typename" member is renamed to "type_name".
(anderson@redhat.com)
- Fix for the date displayed by the initial system banner and by the
"sys" command for Linux version 3.6 and later. Without the patch,
the date displayed will be that of the UNIX epoch, i.e., midnight,
Jan 1, 1970 UTC, adjusted to local time.
(anderson@redhat.com)
- When the eppic.so extension module is built by "make extensions", the
EPPIC source tree is downloaded from its upstream source repository
at https://code.google.com/p/eppic. However, if an EPPIC_GIT_URL
environment variable is defined, then the URL that it points to will
be used as an alternative git source repository.
(per.fransson.ml@gmail.com)
- Fix for a segmentation violation generated by the "struct" command
when printing a structure member using the "struct_name.member"
argument format, where the member is a "char *" that points to a
string that contains a "%" character.
(bob.montgomery@hp.com, adk@acunu.com)
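The entry does not state the root cause, but a common way for a "%" in data to cause a fault is for the data to be used as a printf-style format string. Under that assumption, the safe pattern is to pass the string as an argument to "%s". The function name below is hypothetical.

```c
#include <stdio.h>

/* Format a member value that came from dump memory. The text is passed
 * as an argument to "%s", never as the format string itself, so any '%'
 * in it is copied literally. */
static int format_member_string(char *out, size_t outsz,
                                const char *member, const char *value)
{
    return snprintf(out, outsz, "%s = \"%s\"", member, value);
}
```

Had the value been passed as the format itself, sequences such as "%s" or "%n" inside it would have been interpreted as conversions with no matching arguments.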
- Patchset to support the most recent Xen hypervisor and Xen pvops
kernels:
(1) Always calculate max_cpus value
(2) Read only crash notes for onlined CPUs
(3) Read variables from dynamically allocated per_cpu data
(4) Get idle data from alternative source
(5) Read data correctly from dynamically allocated console ring
(6) Add support for 3 level P2M tree
(daniel.kiper@oracle.com)
- Fix for building a 32-bit eppic.so extension module after having
built crash with "make target=ARM" or "make target=X86" on an x86_64
host. Without the patch, the eppic.so extension module would be
built as a 64-bit binary.
(per.fransson.ml@gmail.com, anderson@redhat.com)
- For the ARM architecture, fix the determination of the kernel modules
base address when modules are not installed, and update the "mach"
command to display the "KERNEL MODULES BASE" address.
(mika.westerberg@iki.fi, anderson@redhat.com)
- Fix for the "kmem -[sS]" commands for Linux version 3.6 and later
kernels configured with CONFIG_SLUB. Without the patch, the commands
fail with the error message "kmem: invalid structure member offset:
kmem_cache_objsize".
(anderson@redhat.com)
- Fix for an invocation failure when running against Linux version 3.6
and later kernels that are configured with CONFIG_SLAB. Without the
patch, the crash session fails during initialization with the error
message "crash: invalid structure member offset: kmem_cache_s_next".
(anderson@redhat.com)
- Fix for the "kmem -[sS]" commands on kernels that are configured with
CONFIG_SLUB to prevent a silent hang if a per-node slab cache partial
list recurses back onto itself. Without the patch, it was necessary
to kill the command; with the patch an error message is displayed and
the command continues on to the next kmem slab cache.
(anderson@redhat.com)
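A list that loops back onto itself can be detected without extra storage by Floyd's tortoise-and-hare algorithm. This is a swapped-in technique used for illustration, not necessarily what the patch does, and the sketch uses a hypothetical NULL-terminated list rather than the kernel's circular list_head.

```c
#include <stddef.h>

struct node_sketch {
    struct node_sketch *next;
};

/* Floyd's tortoise-and-hare: return 1 if following next pointers from
 * head ever revisits a node, 0 if the list ends in NULL. */
static int list_recurses(const struct node_sketch *head)
{
    const struct node_sketch *slow = head, *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast)
            return 1;
    }
    return 0;
}
```

A walker that sees 1 returned can report the corrupted list and move on to the next cache instead of spinning.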
- Fix for the "kmem -[sS]" and "kmem -s list" options on dumpfiles from
kernels that are configured with CONFIG_SLUB which have been filtered
by the makedumpfile facility. Without the patch, it is possible that
those commands may generate the error message "kmem: page excluded:
kernel virtual address: <address> type: kmem_cache buffer", and
would require either the "--zero_excluded" command line option or
having to execute "set zero_excluded on" during runtime in order to
complete successfully.
(anderson@redhat.com)
(08/17/12)
6.0.8-1.fc18
- Fedora Rawhide build: crash-6.0.8-1.fc18 (07/02/12)
http://koji.fedoraproject.org/koji/buildinfo?buildID=328759
- Fix for building on host machines that have glibc-2.15.90 installed,
in which case the glibc header file /usr/include/bits/siginfo.h no
longer declares a "struct siginfo", but only the "siginfo_t" typedef.
Without the patch, the build of the embedded gdb module fails with
the error message "linux-nat.h:63:18: error: field 'siginfo' has
incomplete type".
(anderson@redhat.com)
6.0.8 - Introduction of a new "tree" command that can be used to dump the
addresses of all data structure entries in a red-black tree or
a radix tree. Similar in nature to the "list" command, each data
structure in a tree can be dumped in total, or one or more members
in each structure may be dumped.
(qiaonuohan@cn.fujitsu.com, anderson@redhat.com)
- If a compressed kdump header contains an invalid "nr_cpus" value,
allow the crash session to continue after printing a warning
message. Without the patch, on non-S390/S390X systems, an invalid
nr_cpus value generates a message such as "crash: compressed kdump:
invalid nr_cpus value: 0", and the session subsequently fails with
the message "crash: vmcore: not a supported file format". However,
compressed kdumps have been seen that have an nr_cpus value of 0,
but the session can still run normally. The patch changes the
message to "WARNING: compressed kdump: invalid nr_cpus value: 0",
and the session is allowed to continue.
(anderson@redhat.com)
- Clarify the "help -n" output for compressed kdumps to show the
offsets and sizes of the vmcoreinfo, notes, and eraseinfo sections
in both hexadecimal and decimal, and to cleanly handle compressed
kdumps that have no NT_PRSTATUS notes in the notes section.
(anderson@redhat.com)
- Fix for the X86 "bt" command for a possible situation where the
crashing cpu's back trace starts at the "sysrq_handle_crash" stack
frame instead of farther down the stack below the exception at the
"crash_kexec" stack frame.
(anderson@redhat.com)
- Fix for the "runq" command for kernels that have the CFS scheduler.
Without the patch, tasks queued on a priority array of a cpu's RT
runqueue may not be displayed.
(anderson@redhat.com)
- Fix for analyzing dumpfiles from kernel version 3.5 and later, in
which the kernel log buffer has been converted from a byte-buffer to
a variable-length record buffer. Without the patch, the crash
session fails during initialization with the error message "crash:
cannot determine length of symbol: log_end". If the session is run
on a live system, or if the session is invoked with the "-s" command
line option, the session is not killed, but in those cases the "sys"
and "log" commands will fail with the same error message.
(anderson@redhat.com)
- For kernel versions 3.5 and later, in which the kernel log buffer has
been converted from a byte-buffer to a variable-length record buffer,
two new options have been added. The "log -t" option will display
log messages without the timestamp prepended. The "log -d" option
will display the dictionary of key/value pair properties that the
kernel's dev_printk() function optionally appends to a message.
(anderson@redhat.com)
- The SIAL extension module has been replaced by the "eppic" facility,
which stands for "Embeddable Pre-Processor and Interpreter for C".
The eppic git tree is located at http://code.google.com/p/eppic.
When "make extensions" is done, the eppic source code will be
downloaded automatically via "git clone", and then the "eppic.so"
extension module will be built. The "eppic.so" extension module
offers the same command set as the older "sial.so" module; the SIAL
extension module source files have been completely removed. If
desired, the eppic sources can be updated by executing "git pull"
from the "extensions/eppic" subdirectory.
(lchouinard@s2sys.com)
- Added a new "list -h" option. When used with -h, the "start"
address must be the address of a data structure that contains
an embedded list_head structure. Updated the "list" help page
to differentiate more clearly between using a "start" address
alone, "-H start", or "-h start", and added a WARNING section
to address the problem of "-h start" passing through an
external LIST_HEAD(), or passing through the actual starting
point of the list that is contained within a different type of
data structure from all the entries in the list.
(ptesarik@suse.cz, qiaonuohan@cn.fujitsu.com, anderson@redhat.com)
- Implemented a new "scope" crash environment variable that can alter
the text scope for viewing the definition of data structures. It is
useful in cases where the kernel defines more than one instance of
a data structure with the same name, and the "wrong" one is
selected by default. The variable takes a kernel or module text
symbol name or address, or an expression evaluating to the same.
If the variable is a module text address, then the command will
attempt to load the module into the crash session if it is not
already loaded; if that fails, then the setting of the variable
will fail.
(anderson@redhat.com)
- Update to the extensions/trace.c extension module to handle a kernel
version 3.4 patch that added a new "ring_buffer_per_cpu.nr_pages"
member, making the trace buffer size per-cpu.
(rabin@rab.in, laijs@cn.fujitsu.com)
- Fix to recognize a kernel version 3.5 patch that changed the
"qstr.len" member from an unsigned integer into a member of an
anonymous structure within an anonymous union. Without the patch,
the following commands fail, displaying the following error messages:
mount: "mount: invalid structure member offset: qstr_len"
files: "files: invalid structure member offset: qstr_len"
vm: "vm: invalid structure member offset: qstr_len"
swap: "swap: invalid structure member offset: qstr_len"
fuser: "files: invalid structure member offset: qstr_len"
The "fuser" command generates the above error because it uses the
"files" command behind the scenes.
(anderson@redhat.com)
- Fix for the function that gathers a cpu's register set from an
NT_PRSTATUS note of an x86 or x86_64 compressed kdump header if one
or more cpus were offline when the system crashed. In that case,
if the requested cpu number is equal to or greater than the number of
online cpus, the function will fail. When that happens, that cpu's
back trace will not have those registers as a fall-back option if the
starting point cannot be determined otherwise.
(anderson@redhat.com)
- Added "ipcs" and "tree" command references to the crash.8 man page.
(anderson@redhat.com)
- Redefined the usage of the "struct -o" flag when used in conjunction
with a symbol or address argument. Without this patch, the behavior
has been to print the warning message "struct: -o option not valid
with an address argument", ignore the "-o", and to just display the
structure at that address. With this patch, each structure member
will be preceded by its virtual address.
(anderson@redhat.com)
- Added new "bt -s [-xd]" options that will display symbol names plus
their offset in each frame. The default behavior is unchanged, where
only the symbol name is displayed. The symbol offset will be
expressed in the default output format, which can be overridden with
the -x or -d options.
(anderson@redhat.com)
- Fix for 32-bit PPC to handle a situation where one or more
NT_PRSTATUS note(s) were not captured in the kdump header due
to cpu(s) not responding to an IPI. Without the patch, the "bt"
command may result in a segmentation violation.
(nakayama.ts@ncos.nec.co.jp)
- Fix for building the PPC64 architecture in ppc64 environments where
applications are built 32-bit by default when -m32 or -m64 are
not specified. This was a regression introduced in the crash-6.0.3
patch that introduced the "make target=PPC" feature that can be
performed on ppc64 hosts. Without the patch, a "make" command would
build a 32-bit PPC crash utility on such ppc64 hosts.
(anderson@redhat.com)
- Fix for the 32-bit PPC "irq" command. Without the patch, depending
upon the kernel version, the command would fail with the message
"irq: cannot determine number of IRQs", or "irq: invalid structure
size: irqdesc".
(nakayama.ts@ncos.nec.co.jp)
- Fix for the 32-bit PPC "pte" command to properly translate the PTE
bit settings based upon the correct Book3E specifications.
(nakayama.ts@ncos.nec.co.jp)
(06/29/12)
6.0.7 - Enhanced the "search" command to allow the searched-for value
to be entered as a crash (expression) or a kernel symbol name.
The resultant value of an (expression) or kernel symbol value
must fit into the designated value size if -w or -h are used,
and neither variant may be used with the -c option. If found,
both the resultant value and the argument input string will be
displayed next to the target address(es).
(anderson@redhat.com)
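The size rule above can be illustrated with a minimal sketch. It assumes
that -w selects 32-bit values and -h selects 16-bit values; the enum and
the value_fits() helper are hypothetical names used only for illustration
and are not taken from the crash sources.

```c
#include <stdint.h>

/* Hypothetical search widths: default (long), -w (int), -h (short). */
enum search_width { SEARCH_LONG, SEARCH_INT, SEARCH_SHORT };

/*
 * Return 1 if the evaluated expression or symbol value can be
 * represented in the selected width, 0 otherwise.
 */
static int value_fits(uint64_t value, enum search_width width)
{
    switch (width) {
    case SEARCH_SHORT:
        return value <= UINT16_MAX;   /* assumed width of -h */
    case SEARCH_INT:
        return value <= UINT32_MAX;   /* assumed width of -w */
    default:
        return 1;                     /* full-width search */
    }
}
```

A caller would evaluate the (expression) or symbol first and reject the
search if value_fits() returns 0 for the requested width.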
- Added a new "search -t" option that will restrict the search
to the kernel stack pages of all tasks. If one or more matches
are found in a task's kernel stack, the output is preceded
with a task-identifying header.
(anderson@redhat.com)
- Fix for the s390x "bt -[tT]" options when run on an active task
on a live system. Without the patch, the options fail with the
message "bt: invalid/stale stack pointer for this task: 0".
(anderson@redhat.com)
- Fix for s390x "vm -p" option, which may show invalid user to
physical address translation data if a page is not mapped.
Without the patch, a page's translation may indicate
"<address> SWAP: (unknown swap location) OFFSET: 0",
or show an incorrect swap offset on an actual swap device.
(anderson@redhat.com)
- Added new "vm -[xd]" options to be used in conjunction with
"vm -[mv]", which override the current default output format
with hexadecimal or decimal format for just the command instance.
Without the patch, it would require changing the default output
format with "hex" or "dec" prior to executing "vm -[mv]". The
new flags may also be used with "foreach vm -[mv]".
(anderson@redhat.com)
- Fix for the s390x "vm -p" and "vtop -u <user-address>" commands
if the page containing the relevant PTE is not mapped. Without
the patch, the commands fail with the error message "vm: read error:
kernel virtual address: 0 type: entry" or "vtop: read error: kernel
virtual address: 0 type: entry"
(holzheu@linux.vnet.ibm.com)
- Fix for the s390x "vm -p" command and "vtop -u <user-address>"
commands to properly translate pages that are swapped out into their
swap file and offset. Without the patch, the swap file and offset
would not be displayed.
(holzheu@linux.vnet.ibm.com)
- Added new "list -[xd]" options to be used in conjunction with
"list -s", which override the current default output format
with hexadecimal or decimal format for just the command instance.
Without the patch, it would require changing the default output
format with "hex" or "dec" prior to executing "list -s".
(anderson@redhat.com)
- Added new "net -[xd]" options to be used in conjunction with
"net -S", which override the current default output format
with hexadecimal or decimal format for just the command instance.
Without the patch, it would require changing the default output
format with "hex" or "dec" prior to executing "net -S". The new
flags may also be used with "foreach net -S".
(anderson@redhat.com)
- Added new "mach -[xd]" options to be used in conjunction with
"mach -c", which override the current default output format
with hexadecimal or decimal format for just the command instance.
Without the patch, it would require changing the default output
format with "hex" or "dec" prior to executing "mach -c".
(anderson@redhat.com)
- If the value read from the cpu online, present, or possible masks
contains a cpu bit value that is outside the architecture's maximum
NR_CPUS value, print a warning message during invocation. Without
the patch, a corrupt vmcore containing a bogus mask value could
quietly corrupt heap memory.
(per.fransson.ml@gmail.com)
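As a rough illustration of the check described above, the following
sketch tests whether a cpu mask contains a bit at or beyond a maximum
cpu count. The function names and the mask layout (an array of unsigned
long words) are assumptions made for the example; the actual crash
implementation may differ.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Return the highest cpu number set in the mask, or -1 if it is empty. */
static int highest_cpu_in_mask(const unsigned long *mask, size_t nwords)
{
    for (size_t w = nwords; w-- > 0; ) {
        if (mask[w] == 0)
            continue;
        for (int b = BITS_PER_WORD - 1; b >= 0; b--) {
            if (mask[w] & (1UL << b))
                return (int)w * BITS_PER_WORD + b;
        }
    }
    return -1;
}

/*
 * Return 1 if every set bit is below max_cpus (a stand-in for the
 * architecture's NR_CPUS), 0 if the mask looks bogus.
 */
static int cpu_mask_is_sane(const unsigned long *mask, size_t nwords,
                            int max_cpus)
{
    return highest_cpu_in_mask(mask, nwords) < max_cpus;
}
```

A caller that gets 0 back would print a warning and avoid indexing
per-cpu arrays with the out-of-range cpu number.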
- Add support for reading dumpfiles compressed by LZO using
makedumpfile version 1.4.4 or later. This feature is disabled by
default. To enable this feature, build the crash utility in the
following manner:
(1) Install the LZO libraries by using the host system's package
manager or by directly downloading the libraries from the author's
website. The packages required are:
- lzo
- lzo-minilzo
- lzo-devel
The author's website is: http://www.oberhumer.com/opensource/lzo
(2) Create a CFLAGS.extra file and an LDFLAGS.extra file in the
top-level crash sources directory:
- enter -DLZO in the CFLAGS.extra file
- enter -llzo2 in the LDFLAGS.extra file.
(3) Build crash with "make" as always.
(d.hatayama@jp.fujitsu.com)
- Fix for the included "trace" extension module. Without the patch,
if the module initialization sequence fails, a double-free in the
module may lead to a subsequent malloc() segmentation violation
in the crash session.
(per.fransson.ml@gmail.com)
- Incorporated the "ipcs" extension module written by Qiao Nuohan
as a built-in command. The command displays the kernel's usage
of the System V shared memory, semaphore and message queue IPC
facilities. It differs from the original extension module by
fixing a failure scenario if the current task is exiting, and
adds a "-n pid|task" option, which displays the IPCS facilities
with respect to the namespace of a given pid or task.
(qiaonuohan@cn.fujitsu.com, anderson@redhat.com)
- Fix for a gdb-7.3.1 regression that causes the line number capability
to fail with certain ranges of x86 base kernel text addresses.
Without the patch, the "dis -l <symbol>" or "sym <symbol>"
commands would fail to show line number information for certain
ranges of base kernel text addresses.
(anderson@redhat.com)
- Added a new "printm" command to the embedded gdb module. It
is currently only used by the "pstruct" extension module, but
can be used to dump the type, size, offset, bitpos and bitsize
values of an expression.
(qiaonuohan@cn.fujitsu.com)
- Added a new "runq -t" option that displays the timestamp information
of each cpu's runqueue, which consists of either the rq.clock, the
rq.most_recent_timestamp or rq.timestamp_last_tick value, whichever
applies. Following each cpu timestamp is the last_run or timestamp
value of the active task on that cpu, whichever applies, along with
the task identification.
(weijg.fnst@cn.fujitsu.com, anderson@redhat.com)
- Fix for an initialization-time warning when running on a live system
with the most recent version of the modprobe command, which no longer
supports the -l and --type options. The modprobe command is used to detect
whether the crash.ko memory driver is part of the distribution.
Without the patch, a warning message is issued that indicates
"/sbin/modprobe: invalid option -- 'l'". If the driver is built into
the kernel, the message is harmless. If the driver is not built into
the kernel, then the crash.ko (/dev/crash) driver would not be selected
as the live memory source.
(anderson@redhat.com)
(05/30/12)
6.0.6-1.fc18
- Fedora Rawhide build: crash-6.0.6-1.fc18 (04/30/12)
http://koji.fedoraproject.org/koji/buildinfo?buildID=306178
6.0.6 - Extend the supported cross-architecture build capability so that
it applies to the SIAL extension module. Without the patch,
when building the SIAL module in an environment where the
overlying crash utility was built with "make target=ARM",
"make target=PPC", or "make target=X86", the SIAL extension
module would continue to be built for the host architecture.
(rabin@rab.in, anderson@redhat.com)
- Fixes for memory leaks and possible segmentation violations
when unloading SIAL extension module scripts.
(rabin@rab.in)
- Fix for the new "foreach RU" task state qualifier. Without the
patch, the runnable tasks are not selected.
(anderson@redhat.com)
- Fix to disallow multiple task states from being entered using
the "foreach <task-state>" qualifier. Without the patch, if
multiple states were entered, the last one on the command line
would be honored.
(anderson@redhat.com)
- Fix for the "extend" command to allow the usage of 32-bit PPC
extension modules. Without the patch, the command fails with the
message: "extend: <object>.so: not an ELF format object file".
(anderson@redhat.com)
- If an input line starts with "#" or "//", then the line will be
saved as a comment that is visible when re-cycling through the
command history list.
(olivier.daudel@u-paris10.fr, anderson@redhat.com)
- Fix for a crash-5.1.9 regression that broke the "bt -g" option.
Without the patch, the option is ignored completely.
(ptesarik@suse.cz)
- Fix for s390x virtual-to-physical translation of virtual addresses
that are backed by 1MB pages.
(ptesarik@suse.cz)
- The s390x has a dumpfile method that creates "live dumps", where
the kernel continues to run while the dumpfile is being created.
The initial system banner display and the "sys" command will inform
the user that the dumpfile is a "[LIVE DUMP]", and the "bt -a" option
will fail with the message "bt: -a option not supported on a live
system or live dump".
(holzheu@linux.vnet.ibm.com)
- Newly-created dumpfiles generated by the "snap.c" extension module
will now be recognized as "live dumps". Accordingly, the initial
system banner display and the "sys" command will inform the user that
the dumpfile is a "[LIVE DUMP]", and the "bt -a" option will fail
with the message "bt: -a option not supported on a live system or
live dump".
(anderson@redhat.com)
- If "bt" alone is attempted on an active task in a "live dump", it
will indicate "(active)", i.e., the same as if it were attempted on
a live system.
(holzheu@linux.vnet.ibm.com)
- If an extension module does not define the appropriate architecture,
i.e., "-DX86", "-DX86_64", etc., then the inclusion of "defs.h" will
generate a compiler failure indicating "error: 'NR_CPUS' undeclared
here (not in a function)". In that case, the architecture will now
default to that of the host machine.
(anderson@redhat.com)
- Prevent a highly-unlikely incorrect calculation of the maximum
cpudata array length of a kmem_cache during initialization of
CONFIG_SLAB kernels.
(anderson@redhat.com)
- Prevent an infinite loop during the initialization of the kmem_cache
subsystem in CONFIG_SLAB kernels if the cache list or the vmcore is
corrupt. If the kmem_cache list links back into itself, messages
showing the first "duplicate" entry in the list and "crash: unable
to initialize kmem slab cache subsystem" will be displayed.
(anderson@redhat.com)
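The kind of check described above can be sketched as follows. This is a
minimal illustration using a bounded "seen" array on an in-memory list;
the node type, the MAX_ENTRIES limit and the first_duplicate() name are
hypothetical, and the crash utility reads list entries from the dumpfile
and may track visited entries differently.

```c
#include <stddef.h>

struct node {
    struct node *next;
};

#define MAX_ENTRIES 1024   /* arbitrary bound for this sketch */

/*
 * Walk a circular list starting after "head".  Return NULL if the walk
 * arrives back at the head (or hits a NULL link), otherwise return the
 * first entry that is encountered a second time.  If MAX_ENTRIES is
 * exceeded, the entry at which the limit was hit is returned.
 */
static const struct node *first_duplicate(const struct node *head)
{
    const struct node *seen[MAX_ENTRIES];
    size_t count = 0;

    for (const struct node *n = head->next; n != head; n = n->next) {
        if (n == NULL)
            return NULL;
        for (size_t i = 0; i < count; i++) {
            if (seen[i] == n)
                return n;          /* list links back into itself */
        }
        if (count == MAX_ENTRIES)
            return n;
        seen[count++] = n;
    }
    return NULL;
}
```

The linear scan of the "seen" array keeps the sketch short; a hash table
would be the usual choice for long lists.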
- Update to the "mod" command to additionally search for module object
files in the directory containing the kernel namelist (vmlinux) file.
This will allow an alternate module-debuginfo directory tree to be
set up like so:
# cd <directory>
# rpm2cpio kernel-debuginfo-<release>.rpm | cpio -idv
Having done that, and by referencing the vmlinux file in that
directory tree directly or by symbolic link, the "mod" command
will search for module object files starting from the directory
containing the vmlinux file if they are not found in the standard
/lib/modules/<release> directory.
(shane.seymour@hp.com, anderson@redhat.com)
- Update to the s390x "bt" command if a task was running in userspace.
Without the patch, the back trace display ended at the kernel entry
function frame; with the patch, the user space PSW register is
displayed with a "(user space)" tag, followed by the general purpose
register set.
(holzheu@linux.vnet.ibm.com)
- In the unlikely event that the access of ARM or x86_64 kernel unwind
table data fails during crash invocation, print a warning message and
allow the crash session to continue. Without the patch, the crash
session would fail immediately.
(Jan.Karlsson@sonymobile.com, anderson@redhat.com)
(04/27/12)
6.0.5-1.fc18
- Fedora Rawhide build: crash-6.0.5-1.fc18 (03/26/12)
http://koji.fedoraproject.org/koji/buildinfo?buildID=309714
6.0.5 - Enhancement to the "foreach" command to allow any of the "name"
arguments to be POSIX extended regular expressions. The expression
string must be encompassed by "'" characters, and will be matched
against the names of all tasks.
(qiaonuohan@cn.fujitsu.com, anderson@redhat.com)
- Fix for the embedded gdb module's "ptype" command, and by extension,
the crash utility's "struct" command, to be able to fully display
embedded structure or union members of a structure/union. Without
the patch, if a structure or union is a member of a structure or
union that is a member of a structure or union, then it is displayed
as "struct {...}" or "union {...}".
(qiaonuohan@cn.fujitsu.com)
- Extend the "ps -l" output to also display the task state next to
its last_run/timestamp value.
(qiaonuohan@cn.fujitsu.com)
- Enhancement to the "foreach" command which adds a new "state"
task-identifier argument that filters tasks by their task state.
The state argument may be any of the task states displayed by
the "ps" command: RU, IN, UN, ST, ZO, SW or DE.
(rabin@rab.in, anderson@redhat.com)
- Implemented a new pc->cmd_cleanup function pointer and an optional
pc->cmd_cleanup_arg argument that will allow any command to register
a function and an optional argument that will be called after a
command has completed successfully, or more likely, unsuccessfully.
Normally the only cleanup required for a command is the freeing of
buffers that were allocated with GETBUF(), but that is performed
automatically after each command is run. However, with the
introduction of the new POSIX regular expression functionality of
the "foreach" command, there needed to be a way to call regfree()
in the case where regcomp() was called successfully, but then
the command later encountered one of several fatal error conditions.
This facility is also available for use by extension module commands.
(anderson@redhat.com)
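The registration pattern described above might look roughly like the
following sketch. The cmd_context structure stands in for the real
program context, and free_regex(), run_cleanup() and count_matches() are
hypothetical helpers; only regcomp(), regexec() and regfree() are real
POSIX calls. The regfree_calls counter exists purely so the behavior can
be observed.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields described in the entry above. */
struct cmd_context {
    void (*cmd_cleanup)(void *);
    void *cmd_cleanup_arg;
};

static int regfree_calls;   /* demo counter only */

/* Cleanup handler: release the compiled expression. */
static void free_regex(void *arg)
{
    regfree((regex_t *)arg);
    regfree_calls++;
}

/* Called after every command, whether it succeeded or failed. */
static void run_cleanup(struct cmd_context *pc)
{
    if (pc->cmd_cleanup)
        pc->cmd_cleanup(pc->cmd_cleanup_arg);
    pc->cmd_cleanup = NULL;
    pc->cmd_cleanup_arg = NULL;
}

/*
 * Compile an extended regular expression, register its cleanup, and
 * count how many task names match.  Returns -1 on a bad pattern.
 */
static int count_matches(struct cmd_context *pc, regex_t *re,
                         const char *pattern,
                         const char **names, size_t n)
{
    int hits = 0;

    if (regcomp(re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    /* From here on, any exit path must end up calling regfree(). */
    pc->cmd_cleanup = free_regex;
    pc->cmd_cleanup_arg = re;

    for (size_t i = 0; i < n; i++) {
        if (regexec(re, names[i], 0, NULL, 0) == 0)
            hits++;
    }
    return hits;
}
```

Because the handler is cleared once it has run, calling run_cleanup() a
second time is harmless.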
- Enforce the usage of a kernel thread's pgd from its active_mm for
the ARM "vtop -c" command; if its active_mm is NULL, make the command
fail similarly to the other architectures, displaying the error
message "vtop: no active_mm for this kernel thread".
(rabin@rab.in)
- Fix for the x86_64 "bt" command running against recent kernels
if an active task was operating on its IRQ stack when the crash
occurred. Without the patch, the determination of the IRQ exception
frame was off-by-8, displaying invalid register data and the error
message "bt: WARNING: possibly bogus exception frame".
(anderson@redhat.com)
- Update to handle the vfsmount structure change in 3.3 kernels, in
which most members of the vfsmount structure have been moved into a
new "struct mount", and the vfsmount structure has been embedded in
the new mount structure. Without the patch, the following commands
will fail, displaying the following error messages:
mount: "mount: invalid structure member offset: vfsmount_mnt_list"
files: "files: invalid structure member offset: dentry_d_covers"
vm: "vm: invalid structure member offset: dentry_d_covers"
swap: "swap: invalid structure member offset: dentry_d_covers"
fuser: "files: invalid structure member offset: dentry_d_covers"
The "fuser" command generates the above error because it uses the
"files" command behind the scenes.
(qiaonuohan@cn.fujitsu.com)
- Fix for the "ps" command to prevent the display of "??" under the
ST (task state) column. Without the patch, in more recent kernels,
if more than one bit were set in the task_struct.state field, the
state would display "??". With the fix, the primary state will
always be displayed.
(anderson@redhat.com)
- Update to the output of the "set" command when it displays a
task's state. Without the patch, if more than one bit was set
in the task_struct.state field, "STATE: (unknown)" would be
displayed. With the fix, all bits in both the task_struct.state
and task_struct.exit_state fields are translated.
(anderson@redhat.com)
- Implemented a new "vm -P <vma-address>" option, which is similar
to "vm -p", but only does the page translations of the specified
VM area of a context.
(qiaonuohan@cn.fujitsu.com)
- Add support for the Freescale PowerPC e500mc version of the E500
processor chipset, and rework the PPC platform-specific code in
order to more easily support new processors.
(nakayama.ts@ncos.nec.co.jp)
- Implemented a new "gdb" crash environment variable that can be used
to alter a crash session's behavior such that all commands are passed
directly to the embedded gdb module. The new mode is turned on and
off by entering "set gdb on" and "set gdb off". When running in this
mode, the command prompt will be "gdb>". In order to execute native
crash commands while running in this mode, precede the command with
the "crash" directive, for example, "crash ps".
(bruce.korb@gmail.com, anderson@redhat.com)
- Fix for a "*** stack smashing detected ***: crash terminated" failure
during the initial system banner display on a 32-bit PPC platform.
(nakayama.ts@ncos.nec.co.jp)
- Redesigned/simplified the internal read_string() function to prevent
a potential segmentation violation.
(anderson@redhat.com)
- Updates for the 32-bit PPC "vtop" command output:
(1) Translate kernel virtual addresses for FSL BOOKE by using
the TLBCAM setting
(2) Remove the PMD line from the display
(3) Fix the displayed PHYSICAL values of FSL BOOKE PTE format
(nakayama.ts@ncos.nec.co.jp)
- Fix for crash invocation failure on 3.3-era kernels in which the
former standalone "xtime" timespec structure has been moved into
the "timekeeper" structure. Without the patch, the crash session
would fail early on with the message "crash: cannot resolve: xtime".
The patch also prevents the crash session failure in the unlikely
event that the timespec access fails.
(anderson@redhat.com)
(03/23/12)
6.0.4 - Fix to allow the recently-added "mod -g" and "mod -r" options to be
used together. Without the patch, if both options were used, the
command would fail with a "mod: invalid option" error complaining
about one or the other option letter.
(bob.montgomery@hp.com)
- Additional update for 3.1.x and later kernels configured with
CONFIG_SLAB, which have replaced the kmem_cache.nodelists[] array
with a pointer to an outside array. Without the patch, depending
upon a system's cpu configuration and actual cpu count, the crash
session may display "crash: unable to initialize kmem slab cache
subsystem" during invocation, or if it does succeed, "kmem -s" may
generate a segmentation violation.
(bob.montgomery@hp.com, anderson@redhat.com)
- Document the "crash [-h|--help] all" option in the crash.8 man
page and in the "crash [-h|--help]" output.
(anderson@redhat.com)
- Fix the S390/S390X-specific "s390dbf" command's "hex_ascii" debug
data printing routine to prevent the display of non-ASCII characters.
(holzheu@linux.vnet.ibm.com)
- Fix for ARM stack unwinding on 3.2 and later kernels due to commit:
http://git.kernel.org/linus/de66a979012dbc66b1ec0125795a3f79ee667b8a
(rabin@rab.in)
- Implemented a new "search -x <count>" option that displays the
memory contents before and after any found search target. The
before and after memory context will consist of "count" memory items
of the same size as the searched-for value. This option is not
applicable with the -c option.
(zhangyanfei@cn.fujitsu.com)
- Fix for the x86_64 Xen hypervisor "bt" command. Without the patch,
the contents of the RDX register in exception frames incorrectly
shows the contents of the RCX register.
(s.ikarashi@jp.fujitsu.com)
- Implementation of a platform-based vmalloc address translation scheme
for the 32-bit PPC architecture, introducing support for the PPC44X
platform while maintaining the current default platform. Related to
that, the PTE translation function used by "vtop" properly handles
platforms that use 64-bit PTEs, and the "mach" command displays the
kernel's "powerpc_base_platform" name string.
(suzuki@in.ibm.com)
- Fix for the usage of native gdb commands where the command output is
redirected to a pipe and then redirected to a file. Without the
patch, a command construct such as:
crash> gdb-command | shell-command > output-file
would cause the embedded gdb module to fail with the error message
"gdb: gdb request failed: gdb-command | shell-command".
(anderson@redhat.com)
- Fix to prevent a crash session that is run over a network connection
that is killed/removed from going into a 100% cpu-time loop. The fix
that went into crash-5.0.2 to handle the change in behavior of the
built-in readline() library call does not suffice in cases where
readline() never gets a chance to be called. Accordingly, the crash
session is now initialized with a PR_SET_PDEATHSIG prctl setting,
which will cleanly kill itself upon its parent's death.
(anderson@redhat.com)
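For reference, a parent-death signal is requested on Linux with the
prctl() call, as in this minimal sketch. The helper names are
hypothetical, and the signal that crash actually requests is not stated
above, so it is left as a parameter here.

```c
#include <signal.h>
#include <sys/prctl.h>

/*
 * Ask the kernel to deliver "sig" to this process when its parent
 * exits.  Returns 0 on success, -1 on error (errno is set).
 */
static int die_with_parent(int sig)
{
    return prctl(PR_SET_PDEATHSIG, (unsigned long)sig);
}

/* Return the currently configured parent-death signal, or -1 on error. */
static int current_pdeathsig(void)
{
    int sig = 0;

    if (prctl(PR_GET_PDEATHSIG, &sig) != 0)
        return -1;
    return sig;
}
```

Passing 0 as the signal clears the setting again.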
- Fix for the support of PPC64 compressed kdumps, a regression that
was introduced in crash-6.0.3 when support for 32-bit PPC compressed
kdumps was implemented. Without the patch, the crash session
fails to initialize, showing this warning message:
WARNING: machine type mismatch:
crash utility: PPC64
vmcore: PPC
followed by the error: "crash: vmcore: not a supported file format".
(anderson@redhat.com)
- Fix for the x86_64 "bt" command to prevent the possible skipping
of the stack frame just above an exception frame that indicates
"[exception RIP: unknown or invalid address]". This highly-unlikely
event could occur if the kernel jumps to a bogus text location and
attempts to execute it, or if the exception occurs in vmalloc space
that was allocated with module_alloc() by a systemtap kprobe-handler,
and therefore has no symbolic reference.
(anderson@redhat.com)
(02/24/12)
6.0.3 - Fix to gdb-7.3.1/bfd/bfdio.c to properly zero out a complete struct
stat with a corrected memset argument; caught when compiling with
the Clang Static Analyzer.
(idoenmez@suse.de)
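The exact change to bfdio.c is not reproduced here. As a general
illustration of this class of bug, a common mistake is to pass the size
of a pointer rather than the size of the structure to memset(); the
sketch below shows the intended form. The helper names are hypothetical.

```c
#include <string.h>
#include <sys/stat.h>

/* Zero the whole structure that "st" points to. */
static void clear_stat(struct stat *st)
{
    /*
     * sizeof(*st) is the size of struct stat; sizeof(st) would only be
     * the size of the pointer and would leave most of it untouched.
     */
    memset(st, 0, sizeof(*st));
}

/* Return 1 if every byte of the structure is zero, 0 otherwise. */
static int stat_is_zeroed(const struct stat *st)
{
    const unsigned char *p = (const unsigned char *)st;

    for (size_t i = 0; i < sizeof(*st); i++) {
        if (p[i] != 0)
            return 0;
    }
    return 1;
}
```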
- Fix for the SIAL extension module to remove a call to sial_free()
for an uninitialised variable that can result in a segmentation
violation when unloading a sial script.
(lmcilroy@redhat.com)
- Fix for the "runq" command for kernels that are configured with
CONFIG_FAIR_GROUP_SCHED. Without the patch, tasks contained within
the task-group of a cpu's currently-running task may not be displayed.
(d.hatayama@jp.fujitsu.com, anderson@redhat.com)
- Implemented support for the analysis of 32-bit PPC ELF kdump vmcores.
(suzuki@in.ibm.com)
- Implemented the capability of building a PPC crash binary on a PPC64
host, which can be done by entering "make target=PPC". After the
initial build is complete, subsequent builds can be done by entering
"make" alone.
(suzuki@in.ibm.com, anderson@redhat.com)
- Determine the PPC page size from the kdump PAGESIZE vmcoreinfo data.
(suzuki@in.ibm.com, anderson@redhat.com)
- Fix for the "kmem -[sS]", "kmem -[fF]" and "kmem <address>"
options in 3.2 kernels. Without the patch, the commands fail with
the error "kmem: invalid structure member offset: page_lru".
(anderson@redhat.com)
- Addition of a set of dumpfile read diagnostic debug statements. They
are primarily of use when dealing with kdump invocation or runtime
read failures (ELF kdumps or compressed kdumps), and can serve to
help pinpoint the problem as a faulty/corrupted dumpfile vs. a crash
utility bug. Some statements are seen when invoking crash with "-d1",
more with "-d4", and all of them with "-d8". During runtime, debug
statements may be seen by entering "set debug <level>".
(dmair@suse.com, anderson@redhat.com)
- Fix for X86 kernels that have CONFIG_X86_32, CONFIG_DISCONTIGMEM,
CONFIG_DISCONTIGMEM_MANUAL and CONFIG_NUMA all configured. Without
the patch, the VM subsystem fails to initialize properly because the
pgdat structures are allocated by the remap allocator.
(ptesarik@suse.cz)
- Fix for the "vtop" command on large NUMA X86 kernels where a node's
starting physical address is larger than 32-bits. Without the patch,
the page struct contents of a virtual address may not be displayed.
Associated with that fix, the "kmem -n" line that displays a node's
MEM_MAP, START_PADDR and START_MAPNR values has been adjusted to more
properly handle large physical addresses.
(dmair@suse.com)
- Update for the ARM architecture to recognize a recent change of
its vmlinux section name from ".init" to ".init.text". Without the
patch, a warning message indicating "crash: cannot determine text
init space" is displayed during initialization.
(rabin@rab.in)
- Significant speed increase of the "kmem -p" command, especially on
large-memory systems.
(qiaonuohan@cn.fujitsu.com)
- Implemented new "irq -a" and "irq -s" options. The "irq -a" option
displays the cpu affinity for in-use IRQs. The "irq -s" option
displays per-cpu IRQ stats in a similar manner to /proc/interrupts
for all cpus. To show a limited set of per-cpu IRQ stats, there is
an associated "-c" option that limits the cpus shown, which can be
expressed as "-c 1,3,5", "-c 1-3", or "-c 1,3,5-7,10". The options
are currently restricted to X86, X86_64, ARM, PPC64 and IA64.
(zhangyanfei@cn.fujitsu.com, anderson@redhat.com,
per.xx.fransson@stericsson.com)
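The "-c" list syntax shown above ("1,3,5", "1-3", "1,3,5-7,10") can be
parsed along the lines of the following sketch. The parse_cpu_list()
function, its 64-cpu bitmask and its error handling are assumptions made
for the example and do not reflect the crash utility's own parser.

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Parse a comma-separated list of cpu numbers and ranges into a bitmask.
 * Returns 0 on success, -1 if the list is empty, malformed, or names a
 * cpu at or beyond max_cpus (capped at 64 for this sketch).
 */
static int parse_cpu_list(const char *s, unsigned long long *mask,
                          int max_cpus)
{
    *mask = 0;
    if (max_cpus > 64)
        max_cpus = 64;
    if (*s == '\0')
        return -1;

    while (*s) {
        char *end;
        long first, last;

        if (!isdigit((unsigned char)*s))
            return -1;
        first = last = strtol(s, &end, 10);

        if (*end == '-') {                 /* range such as "5-7" */
            s = end + 1;
            if (!isdigit((unsigned char)*s))
                return -1;
            last = strtol(s, &end, 10);
        }
        if (first > last || last >= max_cpus)
            return -1;

        for (long cpu = first; cpu <= last; cpu++)
            *mask |= 1ULL << cpu;

        if (*end == ',') {
            end++;
            if (*end == '\0')              /* trailing comma */
                return -1;
        } else if (*end != '\0') {
            return -1;
        }
        s = end;
    }
    return 0;
}
```

With this sketch, "1,3,5-7,10" sets bits 1, 3, 5, 6, 7 and 10 of the
mask.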
- Removal of a redundant read of the kernel's __per_cpu_offset pointers
in the ARM architecture's arm_get_crash_notes() function.
(per.xx.fransson@stericsson.com)
- Fix for an ARM architecture segmentation violation because of a stack
overflow due to recursion in the page table translation code. This
was seen when analyzing a dumpfile where the page tables had been
corrupted.
(rabin@rab.in)
- Fix for the "FREE HIGH" tally in the X86 "kmem -i" display.
Without the patch, the PAGES, TOTAL and PERCENTAGE values would
always show zero values.
(anderson@redhat.com)
- Fix for the "kmem -n" output display for 32-bit architectures that
are configured with CONFIG_SPARSEMEM. Without the patch, the values
under the CODED_MEM_MAP, MEM_MAP and PFN columns are all shifted to
the left.
(anderson@redhat.com)
- Cleanup of several SIAL extension module files to address bison 2.5
and gcc 4.4.3 compile-time warnings.
(lchouinard@s2sys.com)
- Fix for "net -[sS]" command options on the ARM architecture. Without
the patch, invalid data would be displayed because the calculation of
the socket address was off by 4 bytes.
(Jan.Karlsson@sonyericsson.com, anderson@redhat.com)
- Fix for the ARM "bt" command to allow the core kernel unwind tables
to be used in cases where the module unwind tables are inaccessible.
(rabin@rab.in)
- Implementation of a new "dev -d" option that displays disk device
I/O statistics. For each disk device, its major number, gendisk and
request_queue addresses are displayed along with the total number of
allocated I/O requests that are in-progress. The total I/O requests
are then split out into synchronous vs. asynchronous counts (or reads
vs. writes in older kernels), and the number that are in-flight in
the device driver.
(wency@cn.fujitsu.com)
- Update for 3.1.x and later kernels configured with CONFIG_SLAB, which
have replaced the kmem_cache.nodelists[] array with a pointer to an
outside array. Without the patch, the crash session fails during
invocation with the error "crash: zero-size memory allocation!".
(bob.montgomery@hp.com, anderson@redhat.com)
- Implemented support for the analysis of 32-bit PPC compressed kdump
vmcores.
(suzuki@in.ibm.com)
- Prevent the "runq" command from dumping an unending loop of tasks if
the CFS runqueue has been corrupted. If the output of a cpu's
runqueue would display a duplicate task, the output will stop with
the message "WARNING: duplicate CFS runqueue node: task <address>".
(dmair@suse.com)
- Repurposed/renamed the rarely-used and rarely-needed "mod -r" option
to "mod -R". The option is used to reinitialize the module data; all
currently-loaded symbolic and debugging data is deleted, and the
installed module list will be updated (live systems only).
(anderson@redhat.com)
- Implemented a new "mod -r" option, which will pass the "-readnow"
flag to the embedded gdb module, which will override the two-stage
strategy that it uses for reading symbol tables from module object
files. If the crash session was invoked with the "--readnow" flag,
then the same override will occur automatically. It should be noted
that doing so will increase the virtual and resident memory set size.
(anderson@redhat.com)
- Performance increase for the "kmem -s <address>" option on
kernels configured with CONFIG_SLAB, most notably on kernels
whose kmem_cache.array[NR_CPUS] array is several pages in size.
(qiaonuohan@cn.fujitsu.com)
- Require that the "<slabname>" argument to "kmem -s <slabname>"
be escaped with a '\' character in two situations:
(1) in the highly-unlikely case of a kmem_cache slab named "list",
to prevent the ambiguity with the "kmem -s list" command option.
(2) if the first character of the <slabname> actually is a '\'
character.
(anderson@redhat.com)
(02/03/12)
6.0.2-1.fc17
- Fedora 17 build: crash-6.0.2-1.fc17 (01/04/12)
http://koji.fedoraproject.org/koji/buildinfo?buildID=281105
6.0.2-1.fc16
- Fedora 16 build: crash-6.0.2-1.fc16 (01/04/12)
http://koji.fedoraproject.org/koji/buildinfo?buildID=281111
6.0.2 - Implementation of a new "arguments-input-file" feature, where an input
file containing crash command arguments may be iteratively fed to
a crash command. For each line of arguments in an input file, the
selected crash command will be executed. Taking a simple example,
consider a file named "input" which contains several task_struct
addresses:
crash> cat input
ffff88022bdc2080
ffff88012ae78ac0
ffff88012c334b00
ffff88012c335540
crash>
Each line in the input file may be passed to a crash command by
entering the redirection character followed by the filename:
crash> ps < input
PID PPID CPU TASK ST %MEM VSZ RSS COMM
5752 624 5 ffff88022bdc2080 IN 0.0 12340 2584 udevd
PID PPID CPU TASK ST %MEM VSZ RSS COMM
5779 4927 1 ffff88012ae78ac0 IN 0.0 97820 3916 sshd
PID PPID CPU TASK ST %MEM VSZ RSS COMM
5956 1 3 ffff88012c334b00 IN 0.0 27712 868 auditd
PID PPID CPU TASK ST %MEM VSZ RSS COMM
5784 5779 2 ffff88012c335540 IN 0.0 108392 1856 bash
crash> struct task_struct.pid,mm < input
pid = 5752
mm = 0xffff88022ab65100
pid = 5779
mm = 0xffff88012c272180
pid = 5956
mm = 0xffff88012b00f7c0
pid = 5784
mm = 0xffff88012ae30800
crash>
The input file may contain anything that can be
inserted into a given crash command line. There is no restriction
on the number of arguments in each line; essentially the data in
each input file line will be inserted into the command line starting
where the "<" character is located, and any intervening whitespace
and the filename will be removed. However, because pipes and output
redirection are set up prior to the insertion of input file data,
pipe or redirection should not be put on input file lines. If that
is attempted, the arguments will just be passed to the command, with
unpredictable results. However, output can be piped or redirected
the same way as can be done with normal commands:
crash> set < input | grep -e COMMAND -e CPU
COMMAND: "udevd"
CPU: 5
COMMAND: "sshd"
CPU: 1
COMMAND: "auditd"
CPU: 3
COMMAND: "bash"
CPU: 2
crash>
Many thanks to Josef Bacik for proposing this feature.
(anderson@redhat.com)
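The substitution rule described in this entry can be modelled with a short sketch. It is not the crash implementation: the function name expand_input_file and the read_lines callable are hypothetical, whitespace is normalized to a single space, and skipping blank lines is an assumption. In crash itself the pipe or output redirection is set up before the substitution; here the remainder of the line is simply carried along as text.

```python
import re


def expand_input_file(command_line, read_lines):
    """Yield one expanded command line per line of the input file.

    read_lines is a callable mapping a filename to its lines; it stands
    in for opening the file so that the sketch stays self-contained.
    """
    # Match the "<", the whitespace around it, and the filename.
    match = re.search(r"\s*<\s*(\S+)", command_line)
    if match is None:
        yield command_line
        return
    prefix = command_line[:match.start()]
    suffix = command_line[match.end():]   # e.g. " | grep ..." stays in place
    for line in read_lines(match.group(1)):
        args = line.strip()
        if args:                          # assumption: blank lines are skipped
            yield prefix + " " + args + suffix


sample = {"input": ["ffff88022bdc2080\n", "ffff88012ae78ac0\n"]}
for expanded in expand_input_file("ps < input", lambda name: sample[name]):
    print(expanded)
```

Each yielded string corresponds to one execution of the selected command, as in the "ps < input" transcript above.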
- Fix for the "runq" command for kernels configured with
CONFIG_FAIR_GROUP_SCHED. Without the patch, it is possible
that a task may be listed twice in a cpu's CFS runqueue.
(d.hatayama@jp.fujitsu.com)
- Fix for the internal parse_line() function to properly handle the
case where the first argument in a line is a string argument that is
encapsulated with quotation marks.
(anderson@redhat.com)
- Fix for the usage of a gzip'd vmlinux file that was compressed with
"gzip -n" or "gzip --no-name" without using "-f" on the command line.
Without the patch, the crash session fails with an error message that
indicates "crash: <string-containing-garbage>: compressed file name
does not start with vmlinux". With the patch, if such a file is used
without "-f", it will be accepted with a message that indicates that
the original filename is unknown, and a suggestion that "-f" be used
to prevent the message.
(anderson@redhat.com)
- Added a new "mod -g" option that enhances the symbol display for
kernel modules. After loading a module's debuginfo data, the module
object's section addresses will be shown as pseudo-symbols, like this
simple example using the crash memory driver module:
crash> mod -g -s crash
... [cut] ...
crash> sym -m crash
ffffffff88edb000 MODULE START: crash
ffffffff88edb000 [.text]: section start
ffffffff88edb000 (t) crash_llseek
ffffffff88edb01d (t) crash_read
ffffffff88edb18c [.text]: section end
ffffffff88edb18c [.exit.text]: section start
ffffffff88edb18c (T) cleanup_module
ffffffff88edb18c (t) crash_cleanup_module
ffffffff88edb198 [.exit.text]: section end
ffffffff88edb2a0 [__versions]: section start
ffffffff88edb2a0 (r) ____versions
ffffffff88edb2a0 (r) __versions
ffffffff88edb4a0 [__versions]: section end
ffffffff88edb920 [.data]: section start
ffffffff88edb920 (d) crash_dev
ffffffff88edb960 (d) crash_fops
ffffffff88edba48 [.data]: section end
ffffffff88edba80 [.gnu.linkonce.this_module]: section start
ffffffff88edba80 (D) __this_module
ffffffff88ee3c80 [.gnu.linkonce.this_module]: section end
ffffffff88ee3cc1 MODULE END: crash
crash>
The option may also be used in conjunction with "mod -S".
(nakayama.ts@ncos.nec.co.jp)
- Fix for the "gdb" command to prevent the option handling of command
lines. Without the patch, a gdb command string that contained a
"-<character>" pair preceded by whitespace, would fail with the
error message "gdb: gdb: invalid option -- <character>".
(anderson@redhat.com)
- Fix for the panic-task determination if a dumpfile is taken on a
system that actually has a cpu count that is equal to its per-arch
NR_CPUS value. Without the patch, the task running on the cpu
whose number is equal to NR_CPUS-1 would be selected.
(d.hatayama@jp.fujitsu.com)
- Fix for the x86_64 "bt" command to handle a recursive entry into
the NMI exception stack. While this should normally never happen,
it is possible if, for example, a kprobe is inserted into a function
that gets executed during NMI handling, and a second NMI is received
after the initial one, corrupting the original exception frame at
the top of the NMI stack. Without the patch, the NMI stack backtrace
and exception frame would be displayed repeatedly; with the patch,
the backtrace and exception frame are followed by the warning message
"NMI exception stack recursion: prior stack location overwritten".
(anderson@redhat.com)
- Support dumpfiles that are created by the PPC64 Firmware Assisted
Dump facility, also known as "fadump" or "FAD". Without the patch,
the panic task cannot be determined from a fadump vmcore which was
subsequently compressed with makedumpfile, and therefore a proper
backtrace of the panic task cannot be generated.
(mahesh@linux.vnet.ibm.com)
- Preparation for new s390x kernels that will increase MAX_PHYSMEM_BITS
from 42 to 46.
(mahesh@linux.vnet.ibm.com, holzheu@linux.vnet.ibm.com,
anderson@redhat.com)
(12/22/11)
6.0.1 - Several fixes/updates for the 32-bit PPC architecture:
(1) Delete "__func__.<number>" symbols from the symbol list.
(2) Update manner of determining the processor speed displayed
by the initial system banner and the "sys" command.
(3) Use the kernel's online cpus mask for determining the cpu count.
(4) Enable the "bt" command to follow traces that start in a per-cpu
IRQ stack.
(5) Fix for the "bt" command to better prevent runaway stack traces.
(6) Fix for the "bt" command to recognize/display 2.6 kernel
exception frames.
(7) Update "bt" command's exception frame register display.
(8) Implement "bt -f" option.
(nakayama.ts@ncos.nec.co.jp)
- Fix for the X86 kernel module line-number capability on some kernels.
It is unclear why only some kernel versions exhibit this problem,
but the newly-embedded gdb version 7.3.1 has changed behaviour such
that the addrmap arrays of module text address blocks may contain
the module text offset values instead of their loaded vmalloc
addresses, and so without the patch, there is no "match" for the
vmalloc address when searching for its line number information.
It is fixed by doing a preliminary symbol search before accessing
the line-number access routine.
(anderson@redhat.com)
- Fix for the X86_64 kernel module line-number capability on kernels
that have functions preceded by the __vsyscall_fn macro, which
puts the kernel text function in the vsyscall page that starts
at virtual address 0xffffffffff600000. This results in a text
address block that starts at a normal kernel text address but
ends with a vsyscall address, which inadvertently contains the
whole vmalloc address range. Without the patch, line number
requests for module vmalloc text addresses would be mistakenly
issued the first text section that ended with a vsyscall address,
but then cannot find line number information in that section.
(anderson@redhat.com)
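The address arithmetic behind this entry can be checked with a small sketch. Only the vsyscall address comes from the entry; the kernel text start, the module text address, and the block sizes are illustrative assumptions, and block_contains is a hypothetical helper rather than a gdb or crash function.

```python
KERNEL_TEXT_START = 0xffffffff80000000  # assumed start of a kernel text block
VSYSCALL_START = 0xffffffffff600000     # vsyscall page address from the entry
MODULE_TEXT_ADDR = 0xffffffff88edb000   # illustrative module (vmalloc) text address


def block_contains(start, end, addr):
    """True if addr falls inside the half-open address block [start, end)."""
    return start <= addr < end


# A block that starts in kernel text but ends in the vsyscall page
# swallows the module address, although the module is not part of it.
print(block_contains(KERNEL_TEXT_START, VSYSCALL_START + 0x1000, MODULE_TEXT_ADDR))

# A block that ends within ordinary kernel text does not.
print(block_contains(KERNEL_TEXT_START, KERNEL_TEXT_START + 0x1000000, MODULE_TEXT_ADDR))
```

The first lookup succeeds only because of the oversized block, which is why the line-number request lands in a section that holds no information for the module address.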
- Fix for the inadvertent patching of the symbols of the 32-bit Xen
hypervisor binary. Without the patch, during initialization the
message "patching 3434 gdb minimal_symbol values" is displayed.
(anderson@redhat.com)
- If the "--mod <directory-tree>" command line option, or the
setting of the CRASH_MODULE_PATH environment variable, or the
"mod -S <directory-tree>" point to a tree that contains only the
separate debuginfo "<module>.ko.debug" files, then those
debuginfo files will be used as the internal "add-symbol-file"
arguments to the embedded gdb module. Without the patch, it was
only acceptable to point to a directory tree that contained the
base "<module>.ko" files, and the separate debuginfo files
were found automatically based upon the directory path to the
base module file. This will allow an alternate module-debuginfo
directory tree to be set up like so:
# cd <directory>
# rpm2cpio kernel-debuginfo-<release>.rpm | cpio -idv
Having done that, the <directory> may be used with the "--mod"
command line argument, or as the CRASH_MODULE_PATH environment
variable, or as the "mod -S <directory>" argument.
(anderson@redhat.com)
- Make the suspension of the verbose/time-consuming "sym -l" output
immediate upon the killing of the output pipe, or the entry of the
first CTRL-c. Without the patch, it would typically take several
seconds, or multiple CTRL-c entries, for the "crash>" prompt to be
re-displayed.
(anderson@redhat.com)
- Fix for the handling of piped commands if the command receiving
the crash output is non-existent or invalid. Without the patch,
the crash command would wait indefinitely unless multiple CTRL-c
entries were entered.
(anderson@redhat.com)
- Fix for the s390x "bt" command's floating point register display
header. Without the patch, the header indicates that only registers
0, 2, 4 and 6 are printed, a relic of the s390 architecture, whereas
on the s390x all floating point registers are displayed.
(holzheu@linux.vnet.ibm.com)
- Fix for the error message displayed when an untrusted .gdbinit file
exists in the current directory. Without the patch, the error
message "WARNING: not using untrusted file: " would be followed by
garbage ASCII data instead of the full pathname of the .gdbinit file.
(anderson@redhat.com)
- Fix for the "kmem -p" and "kmem -i" commands in 3.1 and later kernels
where the page structure's "_count" member was moved into an embedded
anonymous structure. Without the patch, the commands fail with the
error message "kmem: invalid structure member offset: page_count
FILE: memory.c LINE: 4610 FUNCTION: dump_mem_map_SPARSEMEM()".
(anderson@redhat.com)
- Allow the user to append data to the CFLAGS and LDFLAGS variables in
the top-level Makefile. The extra data should be put in files named
"CFLAGS.extra" and "LDFLAGS.extra" in the top-level directory; if
either or both files exist, the extra data within them will be
appended to the relevant variable. Typically the LDFLAGS.extra file
will contain "-l<library>" strings, and the CFLAGS.extra file will
contain "-D<value>" strings. This will allow the crash utility to
be built with optional libraries, and the code that references them
to be encapsulated with associated "#ifdef <value>" sections. The
extra CFLAGS data will also be passed to extension modules that are
built within the local "crash-<version>/extensions" subdirectory.
(anderson@redhat.com)
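The append-if-present behaviour described above can be sketched as follows. The real logic lives in the top-level Makefile and the configure step; the function name append_extra and the flattening of whitespace are assumptions made for illustration.

```python
import os


def append_extra(base_flags, directory, filename):
    """Return base_flags with the contents of directory/filename appended.

    If the file does not exist or is empty, base_flags is returned unchanged.
    """
    path = os.path.join(directory, filename)
    if not os.path.isfile(path):
        return base_flags
    with open(path) as f:
        # Assumption: newlines and repeated blanks collapse to single spaces.
        extra = " ".join(f.read().split())
    return (base_flags + " " + extra).strip() if extra else base_flags


# With no CFLAGS.extra file in the current directory, the flags are unchanged.
print(append_extra("-g -O2", ".", "CFLAGS.extra"))
```

Treating a missing file as "nothing to append" mirrors the entry's statement that the extra files are optional.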
- The LDFLAGS setting in the Makefile can no longer be modified by
hand. It will be automatically configured by the "configure -b"
option, based upon the contents of the optional "LDFLAGS.extra" file.
(anderson@redhat.com)
- Fix for the "runq" command to display the runnable tasks that
are contained within a cgroup's task-group scheduling entity.
Without the patch, only scheduling entities that are individual
tasks get displayed, and runnable tasks in task-group scheduling
entities get skipped.
(d.hatayama@jp.fujitsu.com, anderson@redhat.com)
- Fix for the SIAL extension module when repeatedly loading and
unloading a sial script when a full pathname is specified for the
script. Without the patch, the 4th unload attempt generates a
segmentation violation.
(lmcilroy@redhat.com)
- Fix for the SIAL extension module to register the help and usage
functions for a command only when loading a script.
(lchouinard@s2sys.com)
(11/30/11)
6.0.0-1.fc17
- Fedora 17 build: crash-6.0.0-1.fc17 (10/26/11)
http://koji.fedoraproject.org/koji/buildinfo?buildID=270760
6.0.0 - Updated the embedded gdb version to FSF gdb-7.3.1. This change is
required for kernels built with gcc-4.6.1, which now defaults to
using -gdwarf-4. When using prior versions of crash on such a
vmlinux file, it fails immediately with the message "Dwarf Error:
wrong version in compilation unit header (is 4, should be 2) [in
module vmlinux]" followed by "crash: vmlinux: no debugging data
available".
(anderson@redhat.com)
- Incremental patch for the SADUMP dumpfile support that was introduced
in crash-5.1.8. The patchset fixes minor bugs, cleans up the sadump
module, addresses the issue of gathering the first 640KB backup from
a kdump-enabled kernel, prepares for makedumpfile's support of the
SADUMP format, and has "bt" display the stored register set when the
compressed kdump was generated from an SADUMP dumpfile.
(d.hatayama@jp.fujitsu.com)
- Fix for the "gdb" command, or any command that resolves to a gdb
command, to allow redirection to a pipe or file. This addresses a
regression that was introduced by an unrelated "gdb" command fix
in crash-5.1.4 that prevented the stripping of quotation marks from
the input line. Without the patch, redirection of a "gdb" command
to a pipe or file fails with the error message "gdb: gdb request
failed: <original-command-line-including-redirection>".
(anderson@redhat.com)
- Fix for live system analysis of 32-bit PPC kernels. Without the
patch, the session would fail after displaying the error message:
WARNING: machine type mismatch:
crash utility: PPC
vmlinux: (unknown)
(nakayama.ts@ncos.nec.co.jp)
- Fix to allow vmalloc memory access on 32-bit PPC kernels. Without
the patch, the warning message "WARNING: cannot access vmalloc'd
module memory" would be displayed during invocation, and kernel
virtual memory that was vmalloc'd could not be accessed.
(nakayama.ts@ncos.nec.co.jp)
- Fix to correctly gather task addresses from 32-bit PPC kernels.
Without the patch, during invocation a stream of error messages
indicating "crash: invalid task address in pid_hash: <address>"
would be displayed.
(nakayama.ts@ncos.nec.co.jp)
- Fix for the "bt" command in 32-bit PPC kernels. Without the patch,
the "bt" command would generate a segmentation violation.
(nakayama.ts@ncos.nec.co.jp)
(10/25/11)
5.1.9 - Fixed the compressed kdump panic task determination function to use
the kernel's "crashing_cpu" symbol if it exists. Without the patch,
the function returned 0 because it was using diskdump-specific header
variables that are always set to zero in compressed kdump dumpfiles;
the panic task was then found by searching the kernel stacks of all
of the active tasks.
(anderson@redhat.com)
- Fix for the potential of false-positive warning messages during the
initialization of s390x zdump dumpfiles that would indicate either
"WARNING: multiple active tasks have called die and/or panic" and/or
"WARNING: multiple active tasks have called die".
(holzheu@linux.vnet.ibm.com)
- Removal of superfluous code for gathering registers from the ELF
header in the ARM get_netdump_regs_arm() function.
(per.fransson.ml@gmail.com)
- Additional fixes for the ARM architecture gdb-7.0/bfd/elf32-arm.c and
gdb-7.0/bfd/cpu-arm.c files to handle gcc-4.6 compiler failures.
Without the patch, gcc-4.6 generates "error: variable ‘<variable>’
set but not used [-Werror=unused-but-set-variable]" fatal errors when
the (default) -Werror flag is used. Previous gcc versions considered
local variables that were simply set to some value to be "used", but
that is no longer the case.
(anderson@redhat.com)
- Added new "dis -[xd]" options, which override the current default
output format with hexadecimal or decimal format for just the command
instance. Without the patch, it would require changing the default
output format with "hex" or "dec" prior to executing "dis".
(anderson@redhat.com)
- Added new "task -[xd]" options, which override the current default
output format with hexadecimal or decimal format for just the command
instance. Without the patch, it would require changing the default
output format with "hex" or "dec" prior to executing "task". The
new flags may be used with "foreach task" as well.
(anderson@redhat.com)
- Prevent the "struct -[xd]", "union -[xd]", and "p -[xd]" commands
from accepting both options on the same command line.
(anderson@redhat.com)
- Fixes to top-level crash source files filesys.c, memory.c, netdump.c,
sadump.c, symbols.c, x86.c and lkcd_x86_trace.c to allow them to be
compiled cleanly with gcc-4.6. Without the patch, gcc-4.6 generates
fatal errors indicating "error: variable ‘<variable>’ set but not
used [-Werror=unused-but-set-variable]" when building crash with
"make Warn", or generates similar warning messages when building with
"make warn". This has been tested only on x86, x86_64 and ARM; the
other architectures may still generate errors/warnings when compiling
their machine-specific files with gcc-4.6.
(anderson@redhat.com)
- Fix for the "irq" command on 2.6.39 and later kernels. Without the
patch, the command fails with the message "irq: invalid structure
member offset: irq_desc_t_status".
(anderson@redhat.com)
- Fix for the SIAL extension module that solves the problem of getting
access to integer variables.
(makc@gmx.co.uk)
- Fix for compiler warnings when building the extensions/sial.so
extension module with recent versions of /usr/bin/ld. Without the
patch, two warning messages are displayed: "/usr/bin/ld: Warning:
alignment 4 of symbol 'sialppdebug' in /tmp/ccYSzE2s.o is smaller
than 16 in libsial/libsial.a(sialpp.tab.o)" and "/usr/bin/ld:
Warning: alignment 4 of symbol 'sialdebug' in /tmp/ccYSzE2s.o is
smaller than 16 in libsial/libsial.a(sial.tab.o)".
(maxc@gmx.co.uk)
- If the stack pointer found in the register set stored in the ELF
header of a compressed kdump dumpfile, a KVM dumpfile, or an SADUMP
dumpfile is either NULL or cannot be accessed, the register set will
be dumped after the error message. Without the patch, only the error
message was displayed.
(anderson@redhat.com)
- Preparation of the top-level crash sources for more efficient updates
of the embedded gdb version. The changes should be invisible other
than the fact that all top-level source files will now be compiled
with the -DGDB_xxx flag, because the gdb-defined TYPE_CODE_xxx values
that are exported in defs.h changed in more recent gdb versions.
(anderson@redhat.com)
- Fixes for potential segmentation violations during the panic task
search phase of session initialization from a version 4 or later
x86_64 compressed kdump, in which the number of ELF NT_PRSTATUS
notes in the dumpfile does not match the number of cpus running
when the system crashed.
(Joe.Lawrence@stratus.com, anderson@redhat.com)
- Created an exported set_tmpfile2() function that allows the caller
to pass in their own FILE pointer of an open file that only exists
during the execution of a command. It will afford the recursive-use
protection of open_tmpfile2() plus the automatic closure of the file
if the command fails prior to completion or if the user forgets to
close it with close_tmpfile2().
(anderson@redhat.com)
- Created a new "rd -r <outputfile>" option that copies raw data
from memory to an output file. It can be invoked in either of two
ways:
crash> rd -r <outputfile> <address> <count>
crash> rd -r <outputfile> <address> -e <ending-address>
The <count> value is always a byte count with this option.
(adrian.wenl@gmail.com, anderson@redhat.com)
- Fix for the ARM "bt" command to store the correct value of the fp
register of active tasks. Without the patch, in rare circumstances,
the output may show an empty backtrace.
(per.xx.fransson@stericsson.com)
- Fix to prevent a harmless warning message when /proc/kallsyms is used
as a mapfile argument. Without the patch, during initialization,
the message "crash: /proc/kallsyms: lseek: Invalid argument" is
displayed. If a regular file copy of /proc/kallsyms is used, the
message is not displayed.
(anderson@redhat.com)
- Fix for running against live x86 kernels that have been relocated
by the Intel Trusted Boot or "tboot" facility. Without the patch,
a live crash session fails during invocation with the error message
"crash: vmlinux and /dev/mem do not match!" (or "/dev/crash" if
applicable). As a work-around, "/proc/kallsyms" can be entered on
the command line, or the "--reloc=<size>" option can be used, but
this fix obviates that requirement for live systems.
(anderson@redhat.com)
- Fix for the unlikely event where makedumpfile-generated s390/s390x
compressed kdumps do not have a CPU count in the dumpfile header.
This can happen when a dump is created with older s390 dump tools
that do not write the CPU information into the s390 dump header.
Without the patch, the warning message "crash: compressed kdump:
invalid nr_cpus: 0" is displayed, the dumpfile is not recognized
as a compressed kdump, and the session fails. Since s390/s390x have
a fallback function that gets the CPU register information out of
memory, the same warning message will be displayed, but the dumpfile
will still be recognized as a compressed kdump.
(holzheu@linux.vnet.ibm.com)
- Fix for the "net -s" command on 2.6.38 and later kernels. Without
the patch, the command fails with the error message "net: invalid
structure member offset: inet_opt_daddr".
(bob.montgomery@hp.com, anderson@redhat.com)
(10/17/11)
5.1.8-1.fc15
- Fedora 15 build: crash-5.1.8-1.fc15 (09/20/11)
http://koji.fedoraproject.org/koji/buildinfo?buildID=264627
5.1.8-1.fc16
- Fedora 16 build: crash-5.1.8-1.fc16 (09/21/11)
http://koji.fedoraproject.org/koji/buildinfo?buildID=264832
5.1.8-1.fc17
- Fedora 17 build: crash-5.1.8-1.fc17 (09/20/11)
http://koji.fedoraproject.org/koji/buildinfo?buildID=264624
5.1.8 - Fixes for gdb-7.0 ppc64/ppc-specific files to handle gcc-4.6 compiler
failures. Without the patch, gcc-4.6 generates "error: variable
‘<variable>’ set but not used [-Werror=unused-but-set-variable]"
fatal errors when the (default) -Werror flag is used. Previous gcc
versions considered local variables that were simply set to some value to
be "used", but that is no longer the case.
(anderson@redhat.com)
- Add support for the "bt" command to recognize the new s390x
"restart_stack" used by the PSW restart interrupt in 3.0.1 and
later kernels.
(holzheu@linux.vnet.ibm.com)
- Enhancement to the s390x "bt" command to display the register
contents of the pt_regs structure for interrupts, instead of just
printing the string "- Interrupt -". The pt_regs structure contains
all of the current registers and PSW of the interrupted CPU.
(holzheu@linux.vnet.ibm.com)
- Removed the "files -l" option because it does not support 2.6 or later
kernels, and because it requires structure offset data that can only
be determined if the "lockd" and "nfsd" modules have been built into
the kernel. Given the kernel module dependencies, the command is
more suitable as an extension module, if anyone cares to carry on
its legacy.
(anderson@redhat.com)
- Fix for the "ps" command to disallow the mutually-exclusive "-u"
and "-k" options from being entered together. Without the patch,
whichever of the two options was entered last was acted upon.
Also, the help page was clarified by separating the three process
identifier formats from the "-u", "-k" and "-G" qualifiers.
(anderson@redhat.com)
- Fix for the "ps" command to disallow the mutually-exclusive "-a",
"-t", "-c", "-p", "-g", "-l" and "-r" options from being entered
together. Without the patch, whichever of the seven options was
entered last was acted upon.
- Added new "struct -[xd]" and "union -[xd]" options, which override
the current default output format with hexadecimal or decimal format
for just the command instance. The "-o" member offset values and
the structure size value are also controlled by the new options.
Without the patch, it would require changing the default output
format with "hex" or "dec" prior to executing the "struct" or "union"
command.
(anderson@redhat.com)
- Fix for the "fuser" command, which may occasionally precede its
output with the message "WARNING: FILE_NRHASH has changed from 32"
on 2.6.19 and later kernels. The message is harmless.
(anderson@redhat.com)
- Exported new set_temporary_radix() and restore_current_radix()
functions, which are used to temporarily override the current
output radix setting.
(anderson@redhat.com)
- Fixes for ARM gdb-7.0/bfd/elf32-arm.c file to handle gcc-4.6 compiler
failures. Without the patch, gcc-4.6 generates "error: variable
‘<variable>’ set but not used [-Werror=unused-but-set-variable]"
fatal errors when the (default) -Werror flag is used. Previous gcc
versions considered local variables that were simply set to some value to
be "used", but that is no longer the case.
(anderson@redhat.com)
- Cosmetic fix for command-failure "Usage" messages to prevent the
output from exceeding 80 columns.
(anderson@redhat.com)
- Implemented a new "struct -p" option which can be used to dereference
pointer members and display the target data. The option can be used
with the struct_name.member[,member] format, or if not, all pointers
in the structure will be dereferenced. If the member is a pointer,
the member's data type will be prepended to the member name when
displaying the target address; on the subsequent line(s) the target's
symbol name will be displayed in brackets if appropriate, and if
possible, the target data will be displayed. For example, currently
to display an mm_struct's "pgd" member:
crash> mm_struct.pgd ffff810022e7d080
pgd = 0xffff81000e3ac000
crash>
The -p option shows the data type of "pgd", dereferences the pointer
value, and with -x, displays the target's contents in hexadecimal
regardless of the current output format:
crash> mm_struct.pgd ffff810022e7d080 -px
pgd_t *pgd = 0xffff81000e3ac000
-> {
pgd = 0x2c0a6067
}
crash>
Here the "thread_info" and "binfmt" members of a task_struct
are dereferenced and the targets displayed:
crash> task_struct.thread_info,binfmt ffff8100181190c0 -p
struct thread_info *thread_info = 0xffff810023c06000
-> {
task = 0xffff8100181190c0,
exec_domain = 0xffffffff802f78e0,
flags = 128,
status = 1,
cpu = 3,
preempt_count = 0,
addr_limit = {
seg = 18446604435732824064
},
restart_block = {
fn = 0xffffffff80095a52 <do_no_restart_syscall>,
arg0 = 0,
arg1 = 0,
arg2 = 0,
arg3 = 0
}
}
struct linux_binfmt *binfmt = 0xffffffff80305540
-> <elf_format> {
next = 0xffffffff80305500,
module = 0x0,
load_binary = 0xffffffff80017d99 <load_elf_binary>,
load_shlib = 0xffffffff800838c3 <load_elf_library>,
core_dump = 0xffffffff80086465 <elf_core_dump>,
min_coredump = 4096
}
crash>
When a .member is not specified, all pointers in the structure
are dereferenced, which may be quite verbose depending upon
the structure.
(anderson@redhat.com)
- Implemented support for "SADUMP" dumpfiles, which are created by the
Fujitsu Stand Alone Dump facility. The dump-creation mechanism is
based in hardware-specific firmware, generating a dumpfile in three
different formats: sadump dump device (single partition), sadump dump
device (disk set), and archive file formats. The crash utility
recognizes all three formats.
(d.hatayama@jp.fujitsu.com)
- Fix for the "bt" command to display Control registers 8-15 (s390x and
s390) and floating point registers 8-15 (s390x only) correctly.
Without the patch, the register content was copied from the wrong
location of the save area, and the wrong register values were
displayed for the active tasks.
(holzheu@linux.vnet.ibm.com)
- Fix for 2.6.34 ppc64 kernels, which were changed to dynamically
allocate the paca structure, and changed the data type of the "paca"
symbol from an array to a paca_struct pointer.
(mahesh@linux.vnet.ibm.com)
- Fix for 2.6.36 and later ppc64 kernels, which overwrite the paca
pointer variable to point to a static paca during a crash sequence
just prior to the kexec of the secondary kernel, which contains a
paca_struct.data_offset value that is valid only for the crashing cpu.
However, the kernel change also re-introduced the __per_cpu_offset
array, which had been removed in 2.6.15, which will be used as an
alternative to the per-cpu paca_struct.
(mahesh@linux.vnet.ibm.com)
- The new version of makedumpfile, 1.4.0, contains a facility that
allows a user to filter out kernel data (e.g., security keys,
confidential/secret information, etc.) from a vmcore. The data
that is filtered out is poisoned with character 'X' (0x58). A
filtered ELF kdump vmcore now contains a new "ERASEINFO" ELF note
section that contains the filter data strings used by makedumpfile.
A filtered compressed kdump has a header version number 5, and
contains new offset_eraseinfo and size_eraseinfo members in its
sub-header that point to a copy of the filter data strings. In most
cases, the erased kernel data will be inconsequential to the crash
session, but it is certainly possible that the removal of crucial
kernel data that the crash utility needs may cause the crash session
to fail, cause individual commands to fail, or result in other
unpredictable runtime behaviour. This patch detects whether kernel
data has been erased from the dumpfile, and if so, displays an early
warning message alerting the user. The "help -n" command displays
the filter data strings that were used by makedumpfile.
(mahesh@linux.vnet.ibm.com)
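As a rough illustration of the poisoning described above, the sketch below tests whether a buffer read from a filtered vmcore consists entirely of the 'X' (0x58) poison byte. This is only a reader-side heuristic: crash detects erased data from the ERASEINFO note or the compressed kdump sub-header, not by scanning memory, and the name looks_erased is hypothetical.

```python
ERASE_POISON = 0x58  # ASCII 'X', the poison character named in the entry


def looks_erased(buf):
    """True if a non-empty bytes buffer consists solely of the poison byte.

    A buffer that genuinely holds only 'X' characters would be reported
    as erased too, so treat a positive result as a hint, not as proof.
    """
    return len(buf) > 0 and all(b == ERASE_POISON for b in buf)


print(looks_erased(b"XXXXXXXX"))
print(looks_erased(b"XXXX\x00XXX"))
```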
(09/16/11)
5.1.7-2.fc17
- Fixes for gdb-7.0 ARM-specific files to handle gcc-4.6 compiler
failures. Without the patch, gcc-4.6 generates "error: variable
‘<variable>’ set but not used [-Werror=unused-but-set-variable]"
fatal errors when the (default) -Werror flag is used. Previous gcc
versions considered local variables that were simply set to some value to
be "used", but that is no longer the case.
(anderson@redhat.com)
Available in Fedora dist-rawhide branch:
build: crash-5.1.7-2.fc17
http://koji.fedoraproject.org/koji/buildinfo?buildID=261602
(09/01/11)
5.1.7-1.fc17
- Fixes for gdb-7.0 ppc64/ppc-specific files to handle gcc-4.6 compiler
failures. Without the patch, gcc-4.6 generates "error: variable
‘<variable>’ set but not used [-Werror=unused-but-set-variable]"
fatal errors when the (default) -Werror flag is used. Previous gcc
versions considered local variables that were simply set to some value to
be "used", but that is no longer the case.
(anderson@redhat.com)
Available in Fedora dist-rawhide branch:
build: crash-5.1.7-1.fc17
http://koji.fedoraproject.org/koji/buildinfo?buildID=259184
(08/17/11)
5.1.7 - Fix for the x86_64 "bt" command in the highly-unlikely event that
a non-crashing CPU receives an NMI immediately after receiving an
interrupt from another source in 2.6.29 and later kernels. In
those kernels, the IRQ entry-point symbols "IRQ0x00_interrupt"
through "IRQ0x##_interrupt" no longer exist, but the entry points
exist as memory locations starting at the symbol "irq_entries_start".
Without the patch, if a shutdown NMI interrupt gets received while in
one of the entry point stubs, "bt" will fail with the error message
"bt: cannot transition from exception stack to current process stack".
(anderson@redhat.com)
- The x86 and x86_64 "bt -e" and "bt -E" commands will display symbolic
translations of kernel-mode exception RIP values.
(anderson@redhat.com)
- Clarified two initialization-time CRASHDEBUG(1) messages to make it
obvious that the two linux_banner strings being compared originate
from the memory source or the kernel namelist file.
(anderson@redhat.com)
- Fix for the x86 "bt" command to handle cases where the shutdown NMI
was received when a task had just completed an exception, interrupt,
or signal handler, and was about to return to user-space. Without
the patch, the backtrace would be preceded by the error message
"bt: cannot resolve stack trace", display the trace without the
kernel-entry exception frame, and then dump the text symbols found
on the stack and all possible exception frames.
(anderson@redhat.com)
- Fix for 2.6.33 and later kernels that are not configured CONFIG_SMP.
Without the patch, they fail during initialization with the error
message "crash: invalid structure member offset: module_percpu".
(nakayama.ts@ncos.nec.co.jp)
- Prepare for the imminent change in size of the vm_flags member of
the vm_area_struct to be 64-bits in size for all architectures now
that 32 bits have been consumed. The crash utility code had been
handling the older change of the vm_flags member from a short to a
long, but that would not account for the future change to a 64-bit
member on 32-bit architectures.
(anderson@redhat.com)
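The width concern above can be illustrated with a minimal sketch, assuming the member size is known at run time (for example from debuginfo) and that the dump has the same byte order as the host. The helper name read_flags_member() is hypothetical and is not a crash utility function:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of crash): widen a flags member that
 * is 2, 4 or 8 bytes wide, copied out of a raw structure buffer, to
 * 64 bits.  Host byte order is assumed to match the dump.
 */
uint64_t read_flags_member(const void *buf, size_t member_size)
{
    uint16_t v16;
    uint32_t v32;
    uint64_t v64;

    assert(buf != NULL);

    switch (member_size) {
    case sizeof(uint16_t):          /* historical "short" */
        memcpy(&v16, buf, sizeof(v16));
        return v16;
    case sizeof(uint32_t):          /* "long" on 32-bit architectures */
        memcpy(&v32, buf, sizeof(v32));
        return v32;
    case sizeof(uint64_t):          /* 64-bit member on any architecture */
        memcpy(&v64, buf, sizeof(v64));
        return v64;
    default:
        return 0;                   /* unsupported width */
    }
}
```

Keying on the member size rather than on sizeof(long) is what lets a 32-bit build read a 64-bit member without truncating it.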
- Update of the "vm -f <flags>" option to the current upstream
state. Without the patch, only 23 of the currently-existing 32
bit flags were being translated.
(anderson@redhat.com)
- Fix for the "kmem -s", "kmem -S", "kmem -s <address>" and
"kmem <address>" command options if none of the NUMA nodes in
a multi-node CONFIG_SLAB system have a node ID of 0. Without
the patch, "kmem -s" and "kmem -S" show all slab caches as if they
contain no slabs; if an <address> is specified, the correct slab
cache is found, but the command indicates "kmem: <slab-cache-name>:
address not found in cache: <address>".
(anderson@redhat.com)
- Cosmetic fix for the "kmem -[sS]" options if a CONFIG_SLAB kernel
slab cache contains 100000 or more slabs, or uses a slab size of
1 or more megabytes. Without the patch, the output utilizes more
than 80 columns.
(anderson@redhat.com)
- If a task was in user-space when a crash occurred, the user-space
registers are saved in per-cpu NT_PRSTATUS ELF notes in either
version 4 compressed kdump headers, or in dumpfile headers created
by the Fujitsu "sadump" facility. In that case, the "bt" command
will dump the x86 or x86_64 user-space register set.
(wency@cn.fujitsu.com)
- Fix for the x86 "bt" command to handle cases where the shutdown NMI
was received when a task had just received an interrupt, but before
it had created a full exception frame on the kernel stack and called
the interrupt handler. Without the patch, the backtrace would be
preceded by the error message "bt: cannot resolve stack trace",
display the trace without the kernel-entry exception frame, and then
dump the text symbols found on the stack and all possible exception
frames.
(anderson@redhat.com)
- Fix for the x86 "bt" command to handle cases where the shutdown NMI
was received when a task was in the act of being switched to.
Without the patch, the backtrace would be preceded by the error
message "bt: cannot resolve stack trace", display the trace without
the kernel-entry exception frame, and then dump the text symbols
found on the stack and all possible exception frames.
(anderson@redhat.com)
(07/12/11)
5.1.6 - Fixed several typos in the updated crash.8 man page.
(bob.montgomery@hp.com)
- Created a new "rd -a" option that displays printable ASCII data only,
starting from the specified location. If a "count" argument is not
entered, the display stops upon encountering the first non-printable
character.
(anderson@redhat.com)
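The stop-at-first-non-printable rule for the no-count case can be sketched as below. This is an illustration only, not the crash implementation; printable_run() is a hypothetical name, and the behavior when a "count" is given is deliberately not modeled because the entry does not describe it:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the length of the leading run of
 * printable ASCII bytes in buf, i.e. how much would be shown when no
 * count is supplied.
 */
size_t printable_run(const unsigned char *buf, size_t len)
{
    size_t n = 0;

    assert(buf != NULL);

    while (n < len && isprint(buf[n]))
        n++;
    return n;
}
```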
- Fix for the "search -k" option on X86 kernels whose first memmap page
structure does not map to physical address 0. Without the patch, the
identity-mapped region of the first memory node would not be searched.
(anderson@redhat.com)
- Fix for the "search -k" option in the highly unlikely case of kernels
that have multiple NUMA nodes that are not sequential with respect to
their node IDs and the physical memory they reference, have physical
memory holes between any of the nodes, and do not have memmap page
structures referencing the non-existent inter-node physical memory.
In that event, it is conceivable that a NUMA node would be skipped.
(anderson@redhat.com)
- If the "kmem <address>" argument is a virtual address inside a
kernel module, the first item displayed is the address, followed by
its symbol type, and its symbol-name-plus-offset string. This patch
appends the module name in brackets, similar to what is displayed if
"sym <address>" is entered.
(anderson@redhat.com)
- Fix for "kmem -s <address>" in kernels configured with CONFIG_SLUB
and CONFIG_PAGEFLAGS_EXTENDED if the address is contained in a page
other than the first page in a compound, multi-page, slab. Without
the patch, the command would fail with the message "kmem: address is
not allocated in slab subsystem: <address>".
(anderson@redhat.com)
- Created a new "rd -N" option that displays 16- and 32-bit data in
network byte order, performing byte-swapping if appropriate.
(makc@gmx.co.uk)
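As a sketch of what "network byte order, performing byte-swapping if appropriate" means in practice, the standard ntohs() and ntohl() calls swap on little-endian hosts and do nothing on big-endian hosts. The wrapper names net16_at() and net32_at() are hypothetical and only serve the illustration:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Interpret the 2 bytes at p as a big-endian (network order) value. */
uint16_t net16_at(const void *p)
{
    uint16_t raw;

    assert(p != NULL);
    memcpy(&raw, p, sizeof(raw));   /* avoid unaligned access */
    return ntohs(raw);
}

/* Interpret the 4 bytes at p as a big-endian (network order) value. */
uint32_t net32_at(const void *p)
{
    uint32_t raw;

    assert(p != NULL);
    memcpy(&raw, p, sizeof(raw));
    return ntohl(raw);
}
```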
- Fix for a compiler warning when building with "make warn". Without
the patch, kernel.c generates a message indicating "kernel.c: In
function ‘back_trace’:" followed by 17 messages indicating "kernel.c:
2187: warning: ‘btsave.<member>’ may be used uninitialized in this
function", where there is one message for each <member> of the
bt_info structure.
(anderson@redhat.com)
- Updated the #define of NR_SECTION_ROOTS to match its change upstream
that prevents its value from being calculated to be zero.
(takuo.koguchi.sw@hitachi.com, anderson@redhat.com)
- Fix for a double-free() in the unlikely event of a readmem() failure
in the ARM architecture's read_module_unwind_table() function.
(mika.westerberg@iki.fi)
- Updates to support CONFIG_SPARSEMEM for the ARM architecture.
(mika.westerberg@iki.fi)
- Extended the "mach" command to display the size and address of each
per-cpu IRQ stack and per-cpu exception stack, if they exist. This
extension is applicable to the x86_64 and ppc64 architectures, and
the x86 architecture if applicable. Prior to this patch, the values
were only accessible via "help -t" or "help -m".
(anderson@redhat.com)
- Created a new "kmem -o" option that dumps each cpu's offset value
that is added to per-cpu symbol values to translate them into kernel
virtual addresses. Prior to this patch, the values were only
accessible via "help -k".
(anderson@redhat.com)
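The translation described above is a plain addition of the cpu's offset to the per-cpu symbol value. A minimal sketch, with a hypothetical helper name and made-up offsets in the usage:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: translate a per-cpu symbol value into the
 * kernel virtual address of that cpu's copy, given the table of
 * per-cpu offsets (the values that "kmem -o" dumps).
 */
unsigned long percpu_vaddr(unsigned long symval,
                           const unsigned long *offsets,
                           size_t ncpus, size_t cpu)
{
    assert(offsets != NULL);
    assert(cpu < ncpus);

    return symval + offsets[cpu];
}
```

For example, with illustrative offsets of 0x1000 and 0x2000 for cpus 0 and 1, a symbol value of 0x40 would translate to 0x2040 for cpu 1.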
- Removed the "kmem [-[l|L][a|i]]" options from being advertised by
the "kmem" help page; the options have been obsolete since the Linux
version 2.2 timeframe.
(anderson@redhat.com)
- Fix to support Linux 3.x version number change. Without the patch,
the crash session fails with kernel version 3.0 and later, displaying
the message "WARNING: kernel version inconsistency between vmlinux
and [live memory or dumpfile]", followed by the fatal error message
"crash: incompatible arguments: vmlinux is not SMP -- [live system or
dumpfile] is SMP".
(sebott@linux.vnet.ibm.com, anderson@redhat.com)
- Updates to the sial.c extension module to support the Linux 3.x
version number change.
(sebott@linux.vnet.ibm.com, Luc.Chouinard@trueposition.com)
- Created a new "kmem -g [flags]" option that displays the enumerator
value of bits in the page structure's "flags" field. With no "flags"
argument, the enumerator values of all bits are displayed; when a
hexadecimal "flags" option is added, just the bits in the value are
translated. This option only works with 2.6.26 and later kernels,
which contain the "enum pageflags".
(anderson@redhat.com)
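Translating only the bits set in a hexadecimal "flags" value can be sketched as below. The bit numbers and names in a real session come from the kernel's "enum pageflags" and differ between kernel versions; the table type, function name, and the names used in the usage note are all hypothetical:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One enumerator: its bit number and its name (illustrative only). */
struct flag_name {
    int bit;
    const char *name;
};

/*
 * Hypothetical helper: write the space-separated names of the bits
 * set in flags into out, and return how many bits were translated.
 */
int translate_flags(unsigned long flags, const struct flag_name *tbl,
                    size_t n, char *out, size_t outlen)
{
    size_t used = 0;
    int found = 0;

    assert(tbl != NULL && out != NULL && outlen > 0);
    out[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        int w;

        if (!(flags & (1UL << tbl[i].bit)))
            continue;
        w = snprintf(out + used, outlen - used, "%s%s",
                     found ? " " : "", tbl[i].name);
        if (w < 0 || (size_t)w >= outlen - used) {
            out[used] = '\0';       /* drop the partial name */
            break;
        }
        used += (size_t)w;
        found++;
    }
    return found;
}
```

With a made-up table of { 0, "EXAMPLE_LOCKED" }, { 2, "EXAMPLE_DIRTY" }, { 5, "EXAMPLE_SLAB" }, a flags value of 0x24 selects bits 2 and 5 only.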
(06/07/11)
5.1.5 - Fix to allow a vmlinux.bz2 file to be accepted when it is part of
a relative or absolute pathname. Without the patch, the file is
rejected with the message "crash: <path-to>/vmlinux.bz2: not a
supported file format", although it is still possible to use it with
the "-f" flag.
(d.hatayama@jp.fujitsu.com)
- Fix for the usage of a vmlinux.gz or vmlinux.bz2 file if the
relevant gunzip or bunzip2 file is not located in /usr/bin.
Without the patch on an Ubuntu system, the uncompression fails
because those binaries are only located in the /bin directory.
Also fixed the uncompression error message to differentiate
between gunzip and bunzip2.
(anderson@redhat.com)
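One way to avoid depending on a single directory is to probe a short list of candidates. The sketch below is an assumption about the general approach, not the code in crash; find_program() is a hypothetical name:

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: look for an executable named prog in each
 * directory of dirs, in order.  On success the full path is left in
 * out and 0 is returned; otherwise out is emptied and -1 is returned.
 */
int find_program(const char *prog, const char *const dirs[],
                 size_t ndirs, char *out, size_t outlen)
{
    assert(prog != NULL && dirs != NULL && out != NULL && outlen > 0);

    for (size_t i = 0; i < ndirs; i++) {
        int w = snprintf(out, outlen, "%s/%s", dirs[i], prog);

        if (w < 0 || (size_t)w >= outlen)
            continue;               /* path would not fit */
        if (access(out, X_OK) == 0)
            return 0;
    }
    out[0] = '\0';
    return -1;
}
```

Called with the directories "/usr/bin" and "/bin" and the program "gunzip" or "bunzip2", this would succeed on a system that ships the binaries in either location.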
- Created a new exist_regs_in_elf_notes() function for extension
modules to pre-determine whether an ELF note containing registers
exists for a specified task. The function is also used by the
currently-existing get_regs_from_elf_notes() function to clean up
redundant code in the various get_<arch>_regs_from_elf_notes()
functions that it calls.
(d.hatayama@jp.fujitsu.com)
- Exported the formerly static x86_64_exception_frame() function
to extension modules, and created a new EFRAME_INIT flags argument
that directs the function to fill in the x86_64 pt_regs offset table
and return any errors encountered in doing so.
(anderson@redhat.com)
- Created and exported a new get_kvm_register_set() interface for
extension modules to get a copy of the per-cpu registers stored in
the kvmdump header.
(anderson@redhat.com)
- Fix for the handling of x86_64 compressed kdump dumpfiles where
the crashing system contained more than 454 cpus. Without the
patch, the crash session fails during initialization with the error
message "crash: compressed kdump: invalid nr_cpus value: <cpus>"
followed by "crash: vmcore: not a supported file format".
(tindoh@redhat.com, tachibana@mxm.nes.nec.co.jp)
- Fix for the "uvtop" and "vm -p" commands when run on tasks that
have performed an mprotect(PROT_NONE) on a user-space page. Because
the PAGE_PRESENT bit is not set in that case, the page was presumed
to be swapped out. Without the patch the "vtop <address>" command
fails with the error message "vtop: cannot determine swap location",
and "vm -p" indicates "SWAP: (unknown swap location)" when iterating
over the page.
(d.hatayama@jp.fujitsu.com)
- Fix for the use of the "-g vmlinux" command line option by non-root
users if the /dev/crash module has been preloaded. Without the
patch, after the vmlinux file's debugging information has been
shown, the error messages "ERROR: Removing 'crash': Operation not
permitted" and "NOTE: cleanup_memory_driver failed" are displayed.
(anderson@redhat.com)
- Fix for the s390x "bt" command to handle a program check interrupt
while operating on the process stack. Without the patch, the
backtrace stops prematurely upon reaching the pgm_check_handler()
interrupt handler.
(holzheu@linux.vnet.ibm.com)
- Long overdue rewrite of the crash.8 man page and the associated
"crash -h" built-in usage display. The crash.8 man page clarifies
the required invocation options, adds all of the rarely-used
command line options that have proliferated over the years, and
updates the ENVIRONMENT variables section. The "crash -h" output
closely mimics the relevant parts of the crash.8 man page.
(anderson@redhat.com)
- Fix for the embedded gdb module to determine member offsets of the
pglist_data structure when the kernel was compiled with gcc 4.6.0.
Without the patch, the system MEMORY size shown by the initial system
data and by the "sys" command is nonsensical, the "kmem -n" command
shows faulty memory node data, and if the kernel is configured with
CONFIG_SLUB, "kmem -[sS]" will fail with numerous "kmem: page_to_nid:
cannot determine node for pages: <page-address>" errors. There
may be other ramifications given that the pglist_data structure is
crucial to the functionality of the crash utility.
(tromey@redhat.com)
- Implemented the capability of using the NT_PRSTATUS ELF note data
that is saved in version 4 compressed kdump headers to determine the
starting stack and instruction pointer hooks for x86 and x86_64
backtraces when they cannot be determined in the traditional manners.
(wang.chao@cn.fujitsu.com, wency@cn.fujitsu.com)
- Added a new "--osrelease <dumpfile>" command line option that
displays the OSRELEASE vmcoreinfo string from a kdump dumpfile.
(anderson@redhat.com)
- Fix to recognize the per-cpu symbol name change from "cpu_info"
to "ia64_cpu_info" in 2.6.33 and later ia64 kernels. Without the
patch, the message "WARNING: cannot find cpuinfo_ia64 location"
would appear during invocation, and the "mach -c" command would
fail in a similar manner, indicating "mach: cannot find cpuinfo_ia64
location".
(anderson@redhat.com)
- Fix for "kmem -[sS]" command on 2.6.39 kernels where the original
slab structure members have been moved into an anonymous union.
Without the patch, either command fails immediately with the error
message "kmem: invalid structure member offset: slab_list".
(anderson@redhat.com)
(05/11/11)
5.1.4 - Fix for RT kernels in which the schedule() function has become a
wrapper function that calls the __schedule() function, and where
other functions may call __schedule() directly. Without the patch,
a warning message indicating "crash: cannot determine thread return
address" is displayed during invocation on x86_64 machines, and
backtraces of blocked tasks may have missing or invalid frames.
(anderson@redhat.com)
- Fix for running against live x86 kernels that were configured with
CONFIG_PHYSICAL_START containing a value that is greater than its
CONFIG_PHYSICAL_ALIGN value, and where the first symbol listed by
/proc/kallsyms is not "_text". Without the patch, the crash session
fails during invocation with the error message "crash: vmlinux and
/dev/mem do not match!" (or "/dev/crash" if applicable). As a work-
around, "/proc/kallsyms" can be entered on the command line, or the
"--reloc=<size>" option could be used, but the fix obviates that
requirement for live systems. It should be noted that dumpfiles of
kernels configured that way still do require that "/proc/kallsyms",
or a copy of it, or alternatively the "--reloc=<size>" option, to
be entered on the command line, as detailed in this changelog entry:
https://crash-utility.github.io/crash.changelog.html#4_0_4_5
(anderson@redhat.com)
- Unlike other extension modules, the "sial.so" module must be built
within a pre-built crash source tree because it uses header files
from the embedded gdb module. Therefore if a crash source tree is
laid down, entered, and "make extensions" is entered without first
building the crash utility, the build of sial.so spews numerous
error messages. To avoid that, the sial.mk file has been modified to
check whether the embedded gdb build has been completed, and if it
has not, just displays "sial.so: build failed: requires the crash
gdb-7.0 module".
(anderson@redhat.com)
- If an extension module does not have its own <module>.mk file,
and is built using the extensions/Makefile, then it will be compiled
with the -Wall flag.
(anderson@redhat.com)
- The "trace.so" extension module has been improved to use "trace-cmd"
to implement the "trace show" option, instead of maintaining a
redundant code base within the module itself. The trace-cmd command
is better, mature, and continually maintained. The new "trace show"
option works like so:
(1) builds trace.dat from the core file and dumps it to /tmp.
(2) execs "trace-cmd report" upon the trace.dat file.
(3) splices the output of trace-cmd to the user and unlinks the
temporary file.
(laijs@cn.fujitsu.com)
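The three steps above follow a common pattern: write a temporary file, run an external command on it, relay the output, and unlink the file. A minimal sketch of that pattern, assuming POSIX mkstemp() and popen(); the function name is hypothetical and this is not the trace.so source:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical sketch: write data to a temporary file under /tmp, run
 * "<cmd> <tempfile>", copy the command's stdout to out, then unlink
 * the temporary file.  Returns 0 if the command exited successfully,
 * -1 otherwise.
 */
int report_via_tempfile(const char *cmd, const void *data, size_t len,
                        FILE *out)
{
    char path[] = "/tmp/trace.dat.XXXXXX";
    char cmdline[sizeof(path) + 256];
    char line[4096];
    FILE *fp;
    int fd, n, rc = -1;

    assert(cmd != NULL && data != NULL && out != NULL);

    fd = mkstemp(path);                     /* step 1: create the file */
    if (fd < 0)
        return -1;
    if (write(fd, data, len) != (ssize_t)len)
        goto done;

    n = snprintf(cmdline, sizeof(cmdline), "%s %s", cmd, path);
    if (n < 0 || (size_t)n >= sizeof(cmdline))
        goto done;

    fp = popen(cmdline, "r");               /* step 2: run the command */
    if (fp == NULL)
        goto done;
    while (fgets(line, sizeof(line), fp))   /* step 3: relay its output */
        fputs(line, out);
    if (pclose(fp) == 0)
        rc = 0;
done:
    close(fd);
    unlink(path);                           /* always remove the file */
    return rc;
}
```

In this sketch the command string would be something like "trace-cmd report -i", with the temporary file name appended as the final argument.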
- Updates to the "trace.so" extension module to extract trace_bprintk()
formats from a kernel core dump. It handles both the current format
and a new format that will be pushed out after the merge window has
closed for Linux 2.6.40. The new format is required for the kernel
debugfs to export the same bprintk data as well. This means that the
trace.so extension module will be able to extract more information
than trace-cmd itself can on a running kernel.
(rostedt@goodmis.org)
- Fix for the "gdb" command, or any command that resolves to a gdb
command, to not strip quotation marks from the input line. Without
the patch, any gdb command whose arguments contain quotation marks,
(e.g. "printf") would fail because they get incorrectly stripped
from the input line.
(anderson@redhat.com)
- Fix for the "p" command if its symbolic argument is a "char *" that
points to a static data string containing a "%" character. Without
the patch, the command results in a segmentation violation.
(anderson@redhat.com)
- Fix for the "sys -c" option to display an error message if a known
sys_call_table entry is not a valid system call address. Without
the patch, the compromised system call entry is not displayed unless
the crash debug mode is set to 1 or greater. With the patch, the
system call number will be followed by an error message indicating
"invalid sys_call_table entry: <address> (<symbol-name>)". This
change is only applicable on architectures/kernels where the index of
the sys_call_table array can be confirmed by debuginfo data, i.e.,
is not a loose calculation based upon the next kernel symbol.
(anderson@redhat.com)
- Print a warning message if there is any inconsistency between the
kernel version strings found in the vmlinux file vs. the dumpfile
or live memory. If a System.map file is used to correct the virtual
addresses found in the vmlinux file, the message is not displayed.
(anderson@redhat.com)
- Fix for "kmem -v", and all other commands that search through the
kernel's mapped virtual address list, in x86_64 kernel versions from
2.6.0 to 2.6.11. Those kernels contained a "vmlist" and a separate
"mod_vmlist" list header, both of which point to lists of vm_structs
that described each contiguous block of mapped kernel memory. 2.6.12
and later x86_64 kernels consolidated both lists onto the "vmlist".
Without the patch, the list headed by "mod_vmlist" was not searched.
(anderson@redhat.com)
- Clarify the help page documentation for the "struct -l offset" option
so that it does not imply that the address argument is necessarily
an embedded list_head pointer. The "-l offset" option essentially
provides the capability of the kernel's container_of() macro, such
that the address of an embedded data structure can be used to display
its containing data structure.
(anderson@redhat.com)
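For readers unfamiliar with the container_of() idea referred to above, the following user-space sketch shows the arithmetic: subtract the member's offset from the member's address to get the address of the containing structure. The structures and the widget_from_node() helper are hypothetical examples, and the macro is a simplified equivalent of the kernel's:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical embedded structure and container, for illustration. */
struct list_node {
    struct list_node *next;
};

struct widget {
    int id;
    long payload;
    struct list_node node;      /* embedded member */
};

/* Simplified user-space equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the containing widget from a pointer to its embedded node. */
struct widget *widget_from_node(struct list_node *n)
{
    assert(n != NULL);
    return container_of(n, struct widget, node);
}
```

The "offset" given to "struct -l" plays the role of offsetof(type, member) here, and the embedded member need not be a list_head.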
(03/30/11)
5.1.3 - Implemented support for using vmlinux files that have been compressed
with either gzip or bzip2. For example:
# crash vmlinux.gz vmcore
# crash vmlinux.bz2
The uncompressed file will be temporarily stored either in /var/tmp
or in the directory specified in a TMPDIR shell environment variable.
The compressed filename must at least begin with "vmlinux" so as to
avoid any attempt to uncompress a vmcore file. Gzip'd vmlinux files
are preferable since the uncompress operation is less time-consuming.
(anderson@redhat.com)
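The two rules stated above (the name must begin with "vmlinux", and the uncompressed copy goes to TMPDIR or else /var/tmp) can be sketched as follows. Both function names are hypothetical, and checking the basename rather than the full path is an assumption of this sketch:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical check: does the basename of path begin with "vmlinux"
 * and end in ".gz" or ".bz2"?  Returns 1 if so, 0 otherwise.
 */
int is_compressed_vmlinux(const char *path)
{
    const char *base, *ext;

    assert(path != NULL);

    base = strrchr(path, '/');
    base = base ? base + 1 : path;
    if (strncmp(base, "vmlinux", 7) != 0)
        return 0;               /* e.g. a vmcore.gz is left alone */

    ext = strrchr(base, '.');
    return ext && (strcmp(ext, ".gz") == 0 || strcmp(ext, ".bz2") == 0);
}

/* Directory for the uncompressed copy: $TMPDIR if set, else /var/tmp. */
const char *uncompress_dir(void)
{
    const char *dir = getenv("TMPDIR");

    return (dir && *dir) ? dir : "/var/tmp";
}
```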
- Prevent an unnecessary warning message that was introduced in version
5.1.0 that indicates "WARNING: cannot read .debug_frame data from
<namelist>" when running against vmlinux executables that have a
separate ".debug" debuginfo file, such as RHEL3 vmlinux kernels.
With the patch, the message is only printed if CRASHDEBUG(1).
(anderson@redhat.com)
- Fix for the x86_64 "bt" command if the shutdown NMI is issued to a
32-bit task that has executed a "sysenter" instruction and the RSP
still contains the zero value loaded from the MSR_IA32_SYSENTER_ESP
register. Without the patch, the backtrace issues a warning message
indicating "WARNING: possibly bogus exception frame", and is unable
to make a transition from the NMI exception stack.
(anderson@redhat.com)
- Fixes for the gdb-7.0 sources to address gcc-4.6 compile failures.
Without the patch, gcc-4.6 generates "error: variable ‘<variable>’
set but not used [-Werror=unused-but-set-variable]" fatal errors when
the (default) -Werror flag is used. Previous gcc versions considered
local variables that were simply set to some value to be "used", but that
is no longer the case.
(anderson@redhat.com)
- Fixes for the top-level crash sources to address gcc-4.6 compiler
warnings or errors. Without the patch, building with gcc-4.6 would
generate numerous "error: variable ‘<variable>’ set but not used
[-Werror=unused-but-set-variable]" errors or warnings, depending upon
whether "make warn" or "make Warn" was used.
(anderson@redhat.com)
- Removed -Wp,-D_FORTIFY_SOURCE=2 from the WARNING_OPTIONS string due
to a memmove() oddity seen when using it in conjunction with -O2
with gcc-4.6.
(anderson@redhat.com)
- Implemented three new options for the "search" command. This patch
adds the -c option to search for character strings, the -w option to
search for unsigned hexadecimal integer values, and -h to search for
unsigned hexadecimal short values. The integer and short values are
searched on integer and short alignments respectively. The -w option
is only meaningful on 64-bit systems, to be used in order to search
both the upper and lower 32-bits of each 64-bit long for the 32-bit
value. Strings are searched across contiguous page boundaries, where
the page boundaries being crossed are relevant to the memory type
being searched, i.e., kernel virtual, user virtual, or physical
memory.
(bob.montgomery@hp.com)
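The point of the -w option on 64-bit systems, checking both the upper and lower 32 bits of each 64-bit long, can be sketched as below. The function name is hypothetical and the sketch counts matches rather than printing addresses:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: count the 32-bit-aligned occurrences of value
 * in an array of 64-bit longs by testing both halves of each long.
 */
size_t count_word_matches(const uint64_t *mem, size_t nlongs,
                          uint32_t value)
{
    size_t hits = 0;

    assert(mem != NULL);

    for (size_t i = 0; i < nlongs; i++) {
        if ((uint32_t)(mem[i] & 0xffffffffu) == value)
            hits++;             /* lower 32 bits */
        if ((uint32_t)(mem[i] >> 32) == value)
            hits++;             /* upper 32 bits */
    }
    return hits;
}
```

A search that compared whole 64-bit longs against a 32-bit value would only find it when the other half happened to be zero, which is why the separate option is useful.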
- Restrict the new "search -p" option to physical memory pages that
have a mem_map page structure assigned to them.
(anderson@redhat.com, bob.montgomery@hp.com)
- Hardwire the declaration of the user_regs_struct in x86_64.c for
kernels whose debuginfo data does not contain it.
(wency@cn.fujitsu.com)
- Fix for compiler warnings when building makedumpfile.c and memory.c
with "make warn" on 32-bit systems.
(anderson@redhat.com)
- Fix to more correctly determine the KVM I/O hole size and location.
The I/O hole size to this point in time is either 1GB or 512MB, but
its setting is hardwired into the Qemu code that was used to create
the dumpfile. The dumpfile is a "savevm" file that is designed to be
used for guest migration, and since inter-version save/load is not
supported, the I/O hole information does not have to be encoded into the
dumpfile. Without the patch, the I/O hole for dumpfiles created by
older Qemu versions was not being set to 1GB, so if the KVM guest was
configured with more than 3GB of memory, the crash session would
typically display numerous "read error" messages during session
initialization.
(anderson@redhat.com)
- Fix for the x86 "bt" command on RHEL6 kernels that contain a backport
of upstream commit a00e817f42663941ea0aa5f85a9d1c4f8b212839, which
moved x86 irq-exit functions to a special .kprobes.text section.
Without this patch, "bt" would show nonsensical backtraces that begin
and end with the "ia32_sysenter_target" function, and would dump an
invalid kernel-entry exception frame.
(anderson@redhat.com)
- Fix for the x86 "bt" command to fix a possible failure to backtrace
a non-active "swapper" task. Without the patch, the backtrace would
fail with the error message "bt: cannot resolve stack trace".
(anderson@redhat.com)
- Fix for the x86 "bt" command to prevent the display of a stale
interrupt exception frame left on the stack of a non-active task.
(anderson@redhat.com)
(03/09/11)
5.1.2-2 - Fixes for the gdb-7.0 sources to address gcc-4.6 compile failures of:
gdb-7.0/bfd/elf64-x86-64.c
gdb-7.0/bfd/elf.c
gdb-7.0/bfd/elf-eh-frame.c
gdb-7.0/bfd/elf32-i386.c
gdb-7.0/bfd/aoutx.h
gdb-7.0/bfd/peXXigen.c
gdb-7.0/bfd/archive64.c
gdb-7.0/opcodes/i386-dis.c
Without the patch, gcc-4.6 generates "error: variable ‘<variable>’
set but not used [-Werror=unused-but-set-variable]" fatal errors when
the -Werror flag is used. Previous gcc versions considered local
variables that were simply set to some value to be "used", but that is no
longer the case.
(anderson@redhat.com)
Available in Fedora Rawhide devel branch:
build: crash-5.1.2-2.fc16
http://koji.fedoraproject.org/koji/buildinfo?buildID=230817
Available in Fedora 15 branch:
build: crash-5.1.2-2.fc15
http://koji.fedoraproject.org/koji/buildinfo?buildID=230818
(02/25/11)
5.1.2 - Added the /dev/crash memory driver Makefile and source file to
the crash package for live system analysis when the target
system's kernel was configured with CONFIG_STRICT_DEVMEM, or was
not configured with CONFIG_PROC_KCORE, or whose /proc/kcore is
simply not functional. The driver can be built and installed by
entering the memory_driver subdirectory and entering "make", and
then "insmod crash.ko". If the module is successfully installed,
it will be used by default for live crash sessions.
(anderson@redhat.com)
- Fix for the "extend -u" command option. Without the patch, after
the successful unloading of a crash extension module, there may be
an invalid error message that indicates "extend: <module>.so and
<module>.so are different".
(anderson@redhat.com)
- Implement support for Xen version 4 hypervisor dumpfiles:
1. Accept the "__per_cpu_shift" symbol. Without the patch, crash
initialization fails on X86, X86_64 and IA64 dumpfiles.
2. Make the x86_64 XEN_VIRT_START value a variable dependent upon
the Xen version. Without the patch, crash initialization fails
on x86_64 dumpfiles.
3. In Xen version 4, "init_tss" is a per-cpu symbol. Without this
patch, crash fails during initialization with the error message
"crash: cannot resolve init_tss" on X86 and X86_64 dumpfiles.
4. Each domain can have a different number of max VCPUs in Xen
version 4. Prepare for this by converting the static array into
a dynamic one.
5. If the size of the vcpu array in struct domain is known, use it
to size the dynamically allocated vcpu array in crash. This
enables crash to initialize domains with a different number of
VCPUs than specified by the XEN_HYPER_MAX_VIRT_CPUS macro.
6. The "vcpu" field changed from a fixed array to a pointer to an
array. The size of the array is stored in the (newly introduced)
"max_vcpus" field. Modify xen_hyper_store_domain_context() to
account for this change.
7. The command line options (such as opt_sched) are discarded after
boot in Xen version 4, so they are no longer available. Use the
"ops" variable (if it exists) to determine the active scheduler.
Without this patch, crash fails during initialization with the
error message "crash: schedule data not found".
(ptesarik@suse.cz)
- Created a new extension module API:
int load_module_symbols_helper(char *module);
It takes a kernel module name as an argument, and performs the same
procedure as the command "mod -s <module>".
(nakayama.ts@ncos.nec.co.jp, anderson@redhat.com)
- When loading debuginfo data for kernel module object files with the
"mod" command or with the new load_module_symbols_helper() extension
module API, search a non-standard directory path by specifying the
directory tree in the CRASH_MODULE_PATH shell environment variable.
(nakayama.ts@ncos.nec.co.jp, anderson@redhat.com)
- Created a new extension module API:
int get_kernel_config(char *conf_name, char **str);
The kernel must be configured with CONFIG_IKCONFIG. It takes a
kernel configuration item string, which may be of the format
"CONFIG_XXX" or just "XXX", and a pointer to a char *. The function
returns:
IKCONFIG_Y: configuration item is built into the kernel
IKCONFIG_M: configuration item is part of a kernel module
IKCONFIG_STR: configuration item consists of a string
IKCONFIG_N: the kernel is not configured with CONFIG_IKCONFIG or
the configuration item is not configured.
If "Y", "M" or "STR", the configuration item's string representation
will be pointed to by the passed-in "str" pointer.
(nakayama.ts@ncos.nec.co.jp, anderson@redhat.com)
- Update of the "extensions/trace.c" extension module. Initially
designed to support 2.6.32 (RHEL6), it has been updated to support
kernels up to 2.6.38-rc1.
(laijs@cn.fujitsu.com)
- Fix for the internal parse_line() function to properly handle
multiple string arguments. Without the patch, the second, fourth,
etc., string argument would get broken up into individual tokens.
(bob.montgomery@hp.com)
- Updates to support ARM page table changes and PTE differences that
were introduced in 2.6.38 kernels. Deleted two irrelevant ARM-only
source code comments.
(ext-mika.1.westerberg@nokia.com)
- Removed two unused "struct syment" local variable declarations in
the gdb_add_symbol_file() function in gdb-7.0/gdb/symtab.c.
(ptesarik@suse.cz)
- Replaced the usage of the VOID_PTR() macro with the ULONG() macro for
structure member accesses in vm_area_dump(), vm_area_page_dump() and
next_upage().
(ptesarik@suse.cz)
- Implemented support for makedumpfile's "vmcore.flat" dumpfile format.
It is no longer necessary to revert the flat dumpfile back into an
ELF vmcore or compressed kdump vmcore with "makedumpfile -R", or with
the "makedumpfile-R.pl" script. Without the patch, attempting to use
a flat dumpfile fails with the message "crash: vmcore.flat: not a
supported file format".
(oomichi@mxs.nes.nec.co.jp)
- Moved the GDB_CONF_FLAGS logic from the Makefile into configure.c.
(ptesarik@suse.cz)
- Use memset() in the shift_string_right() utility function, and use
shift_string_right() and memset() in the mkstring() utility function.
(ptesarik@suse.cz)
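As an illustration of a memset()-based right shift of a string, the sketch below moves the text with memmove() and fills the vacated space with blanks. It is written from the description only, with a hypothetical name, and is not a copy of the crash function:

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical sketch: shift the NUL-terminated string s right by cnt
 * characters, padding on the left with spaces.  The caller must
 * provide a buffer with room for strlen(s) + cnt + 1 bytes.
 */
char *shift_right_sketch(char *s, size_t cnt)
{
    size_t len;

    assert(s != NULL);

    len = strlen(s);
    memmove(s + cnt, s, len + 1);   /* move the text and its NUL */
    memset(s, ' ', cnt);            /* blank-fill the vacated space */
    return s;
}
```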
- Added a new "search -p" option to search physical memory. Until now
only kernel virtual memory and user virtual memory of the current
context could be searched.
(anderson@redhat.com)
- Optimization of the "search -k" function resulting in a significant
time reduction when cycling through vmalloc memory space.
(anderson@redhat.com)
- Added a new "search -K" option, which searches a subset of kernel
virtual memory, restricting the search to kernel unity-mapped virtual
memory, and on the x86_64 and ia64 machines, their discretely mapped
kernel-text/static-data regions.
(anderson@redhat.com)
- Added a new "search -V" option, which searches a subset of kernel
virtual memory, restricting the search to kernel virtual memory
that was allocated by vmalloc(), kernel module memory, and the
virtual mem_map region if it exists on x86_64, ia64, ppc64 and
s390x machines.
(anderson@redhat.com)
- Reworked the functionality of the "search" command to break up kernel
virtual memory into machine-dependent address regions so that the
cycling through disparate virtual address regions can be done in a
more efficient manner.
(anderson@redhat.com)
- Fix for "search -k" on x86_64. Early 2.6 kernels did not have their
module virtual address space included in the kernel's "vmlist" array;
without the patch, module memory is not searched.
(anderson@redhat.com)
- Fix for "search -k" on x86_64. Without the patch, the kernel's
mapped kernel text/static-data region is not searched as a separate
region.
(anderson@redhat.com)
- Fix for "search -k" on architectures that may have a virtual memmap,
such as x86_64, ia64, ppc64 and s390x. Without the patch, virtual
memmap array memory is not searched.
(anderson@redhat.com)
- Fix for "search -k" on s390x. Without the patch, the command may
end prematurely with the error message "search: read error: kernel
virtual address: 0 type: entry".
(anderson@redhat.com)
- Fix for "search" command to prevent unnecessary read error messages
for memory pages that are not contained in a xendump dumpfile.
Without the patch, the output may be interspersed with messages that
indicate "search: cannot find mfn in page index".
(anderson@redhat.com)
- Fix for a compile failure of the embedded gdb-7.0/bfd/verilog.c file
due to a smarter -Werror flag used by gcc-4.6.
(anderson@redhat.com)
(02/10/11)
5.1.1 - Fix for the potential to miss tasks when walking the pid_hash table
in 2.6.24 and later kernels. Without the patch, the task will simply
not be seen in the gathered task list.
(holzheu@linux.vnet.ibm.com)
- Enhancement for the ARM architecture's "bt" command to print out
the user space register set for tasks entering the kernel via the
syscall entry point.
(ext-mika.1.westerberg@nokia.com)
- Rework of the handling of "set" commands that are put in .crashrc
files so that only the following options are resolved prior to
session initialization: silent, console, core, edit, namelist, and
zero_excluded. All others are resolved immediately after session
initialization is complete. Accordingly, the use of "set -c <cpu>",
"set -p", "set -a [task|pid]" and "set [pid|task]" are now acceptable
.crashrc commands.
(anderson@redhat.com)
- The entering of "set -v" in a .crashrc file would cause an immediate
segmentation violation. The "set" command rework above defers the
command until session initialization is complete.
(anderson@redhat.com)
- The entering of "set dumpfile <filename>" in a .crashrc file would
cause a fatal "seek" error during session initialization with most
dumpfile types, so the "dumpfile" option has been removed from
the "set" command.
(anderson@redhat.com)
- The execution of "alias" commands from a .crashrc file used to be
performed immediately; that behavior has been changed so that they
are executed immediately after session initialization is complete.
(anderson@redhat.com)
- Enhancement of the "repeat" command to allow command aliases.
(anderson@redhat.com)
- Fix for running "kmem -s" on a live system if an offline cpu is
brought back online while the command is executing. Without the
patch, the online operation may cause a segmentation violation.
(anderson@redhat.com)
- Change the behavior of "bt -[tT]" to allow the command options
to be run on active tasks on live systems. Without the patch,
both command options would display the task data banner followed
by "(active)".
(anderson@redhat.com)
- Fix for the ARM architecture's "irq" command when run on 2.6.36
and later kernels. Without the patch, the command fails with the
error message "irq: invalid kernel virtual address: 23 type:
irq_chip typename". The fix replaces the custom ARM IRQ dumping
function with the architecture-neutral version.
(ext-mika.1.westerberg@nokia.com)
- Introduced support for using /proc/kcore as an alternative source of
live memory to /dev/mem. Doing so allows vmalloc memory access on
32-bit architectures when the underlying mapped physical memory is in
highmem, which is not allowed by the /dev/mem driver. It would also
be usable on systems that are configured with CONFIG_STRICT_DEVMEM
but still configured with CONFIG_PROC_KCORE. To enforce the use of
/proc/kcore, it may be entered on the command line.
(anderson@redhat.com)
- If a live crash session attempts to use /dev/mem as a live memory
source, and it is determined that the system is configured with
CONFIG_STRICT_DEVMEM, /proc/kcore will automatically be tried as
an alternative.
(anderson@redhat.com)
- Fix to allow "/dev/crash" to be entered on the command line for live
sessions. Because it is used automatically if it exists, it is never
necessary to enter it on the command line. However, if it is used,
without the patch, the session fails during initialization with the
error message "crash: /dev/crash: No such file or directory" if the
crash.ko driver is a module (RHEL4/RHEL5), or "crash: /dev/crash:
not a supported file format" if the driver is built into the kernel
(RHEL6).
(anderson@redhat.com)
- Fix for the ARM "bt" command to address the issue behind faulty
warning messages that indicate "WARNING: UNWIND: unsupported
personality routine".
(ext-mika.1.westerberg@nokia.com)
- Fix for the ARM "bt" command to address the issue behind faulty
warning messages that indicate "bt: WARNING: UNWIND: cannot find
index for <address>".
(ext-mika.1.westerberg@nokia.com)
(12/23/10)
5.1.0 - Fix for the x86 "bt" command for the active, non-crashing, tasks
in 2.6.31 and later KVM dumpfile kernels that are not configured
with CONFIG_4KSTACKS. Without the patch, the exception frame
generated by the reboot_interrupt() entry point is not displayed,
and if the task was running in user space, the command would
generate a "bt: cannot resolve stack trace" error message.
(anderson@redhat.com)
- Ksplice Inc. proposed a patch which added the module name to x86 and
x86_64 "bt" frame displays so that it would be readily evident that
a kernel function had been replaced by a ksplice module function.
Given that the functionality is useful for the frame display of any
kernel module function, it has been extended to be used in the "bt"
output of the other architectures, as well as in the output of the
"bt -[tT]" options.
(nelhage@ksplice.com, anderson@redhat.com)
- Enhance the "sym" command to display the containing module name
in brackets (if applicable) when entering a virtual address,
symbol name, or symbol query argument.
(anderson@redhat.com)
- Implemented support for the recognition and display of module per-cpu
symbols after they have been loaded by the "mod -[sS]" command.
Without the patch, any per-cpu symbols declared by a module were not
recognized or displayed at all. With the patch, they are displayed
in the module's symbol list, and as is the case with base kernel
per-cpu symbols, they can be the target of the "p" command in order
to show their type and per-cpu virtual addresses.
(nakayama.ts@ncos.nec.co.jp, anderson@redhat.com)
- Fix for the x86 "bt" command to properly handle a NMI-interrupted
idle task running in cpu_idle(). Without the patch, the backtrace
indicated "bt: cannot resolve stack trace" even though it had resolved
the trace correctly.
(anderson@redhat.com)
- Implemented support for s390x compressed kdump dumpfiles created
by the makedumpfile facility.
(mahesh@linux.vnet.ibm.com)
- Fix for the "bt" command on x86 Xen hypervisor dumpfiles where a vcpu
received a shutdown NMI while running in the event_check_interrupt()
interrupt handler. Without the patch, the backtrace would indicate
"bt: cannot resolve stack trace", and dump the text symbols on the
stack. The patch recognizes all hypervisor entry points at the
top of the vcpu stack.
(anderson@redhat.com)
- Fix for the "bt" command on x86 Xen hypervisor dumpfiles where a vcpu
received a shutdown NMI while running in the hypercall entry point,
but its return address on the stack gets perceived as an assembly
label symbol within the hypercall code. Without the patch, the
backtrace would indicate "bt: cannot resolve stack trace", and dump
the text symbols on the stack. The patch replaces the assembly
label symbol name with "hypercall".
(anderson@redhat.com)
- Fix for the "help -n" output for s390x ELF vmcore dumpfiles to
recognize the EM_S390 e_machine value, the NT_FPREGSET n_type,
and the new NT_S390_TIMER, NT_S390_TODCMP, NT_S390_TODPREG,
NT_S390_CTRS and NT_S390_PREFIX n_types. Without the patch
the e_machine field showed "(unsupported)", and the n_types
showed "(?)".
(anderson@redhat.com)
- Fix for the "help -n" output for s390x ELF vmcore dumpfiles to
properly dump the contents of the descriptor data of each Elf64_Nhdr
note. Without the patch, the pointer to the descriptor data was
incorrectly calculated and the resultant data output was "shifted".
(anderson@redhat.com)
- Fix for the "help -n" output for diskdump and compressed kdump
files to show the file name as stored in the per-file diskdump_data
structure. Without the patch, only "split" dumpfiles displayed their
individual dumpfile names, whereas single dumpfiles showed "(null)".
(anderson@redhat.com)
- Resurrection of the "irq -b" command option for 2.6 kernels.
(anderson@redhat.com)
- Fix for the displaying of data generated from shell-escaped commands
when the data contains a "%" character followed by a conversion
character. Without the patch, a segmentation violation may occur
when a conversion gets attempted by fprintf().
(anderson@redhat.com)
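The failure mode above is the usual one of passing externally
generated text as the format argument. A minimal sketch in C of the
safe pattern, using a hypothetical helper named print_verbatim() that
is not taken from the crash sources:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a crash function): write text produced by
 * an external command without interpreting any '%' sequences in it.
 */
int print_verbatim(FILE *fp, const char *text)
{
	assert(fp != NULL && text != NULL);

	/*
	 * fprintf(fp, text) would treat "%s", "%d", etc. inside text as
	 * conversions and fetch arguments that were never passed.
	 * Passing the text as data to a fixed "%s" format avoids that.
	 */
	return fprintf(fp, "%s", text);
}
```

Calling print_verbatim(stdout, "100%s done\n") prints the string
unchanged, including the "%s".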
- Reworked the do_radix_tree() utility function to work without
depending upon a hardwired copy of the kernel's radix_tree_node
structure, and changed the RADIX_TREE_MAP_SHIFT, RADIX_TREE_MAP_SIZE
and RADIX_TREE_MAP_MASK #define's into dynamically calculated values.
(anderson@redhat.com)
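One way such values can be derived at runtime is from the number of
slots in a radix_tree_node, since the kernel defines the map size as
a power of two and the mask as the size minus one. The sketch below
assumes the slot count has already been obtained (for example from
debuginfo); the structure and function names are hypothetical and do
not reflect how do_radix_tree() is actually written:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the three runtime-derived values. */
struct radix_map_params {
	unsigned int shift;	/* RADIX_TREE_MAP_SHIFT equivalent */
	size_t size;		/* RADIX_TREE_MAP_SIZE equivalent  */
	size_t mask;		/* RADIX_TREE_MAP_MASK equivalent  */
};

/* Derive shift, size and mask from the number of slots per node. */
struct radix_map_params radix_map_params_from_slots(size_t nr_slots)
{
	struct radix_map_params p = { 0, nr_slots, nr_slots - 1 };

	/* The slot count must be a non-zero power of two. */
	assert(nr_slots != 0 && (nr_slots & (nr_slots - 1)) == 0);

	/* Find the shift such that (1 << shift) == nr_slots. */
	while (((size_t)1 << p.shift) < nr_slots)
		p.shift++;

	return p;
}
```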
- Call FREEBUF() on a GETBUF()-generated buffer in the do_radix_tree()
utility function.
(wang.chao@cn.fujitsu.com)
- Store the .debug_frame section offset and size from the vmlinux file,
and use its data as an alternative to the .eh_frame section data in
the x86_64 unwind code.
(wang.chao@cn.fujitsu.com)
- Fix for the "irq" command when run on 2.6.29 kernels, which declared
the irq_desc_ptrs as a static array indexed by NR_IRQS. Without the
patch, the command would show nonsensical IRQ data or fail with the
error message "irq: invalid kernel virtual address: <address> type:
hw_interrupt_type typename".
(anderson@redhat.com)
- Fix for the "irq" command to run with 2.6.34 or later kernels that
replaced the array of irq_desc structures or irq_desc pointers with
a radix tree. Without the patch, the command would fail with the
error message "irq: x86_64_dump_irq: irq_desc[] does not exist?".
(anderson@redhat.com)
- As of 2.6.37, the output of the "irq" command will change from the
current manner of displaying a few cherry-picked structure members
that are of questionable usefulness and a nightmare to maintain. The
new scheme displays the address of the irq_desc/irq_data structure,
and a list of one or more associated irqaction structures and their
name string. With that information, it is a simple matter to ascertain
any other desired data concerning the IRQ.
(anderson@redhat.com)
(11/30/10)
5.0.9 - Make the symbol_search_next() function externally available to
extension modules, as requested for the "pykdump" extension module.
(anderson@redhat.com)
- Fix for the "log" command to recognize that the "log_end" symbol
was changed from an unsigned long to an unsigned int in 2.6.25 and
later kernels.
(anderson@redhat.com)
- Fix to determine the size and location of the x86_64 interrupt stack
on kernels that are not configured CONFIG_SMP. Without the patch,
runtime commands that use the embedded gdb module may fail with the
error message "<segmentation violation in gdb>".
(anderson@redhat.com)
- Suppress the "crash -d1" initialization-time message that indicates
"WARNING: Because this kernel was compiled with gcc version <x.x.x>,
certain commands or command options may fail unless crash is invoked
with the --readnow command line option" to only be displayed with
kernels compiled with gcc versions between 3.4.0 and 4.0.0.
(anderson@redhat.com)
- Fix for the "bt" command on 2.6.33 and later x86_64 kernels, which
contain debuginfo data for "struct user_regs_struct", and where the
dumpfile is a kdump ELF vmcore. Without the patch, the backtrace
for the panic task uses the registers found in the ELF header's
NT_PRSTATUS note as starting hooks, which causes the backtrace to be
essentially truncated, leaving out the exception frame, the exception
handler's frame, and so on, down to the kdump operation. The patch
will only use the ELF header's registers if better starting hooks
cannot be determined.
(anderson@redhat.com)
- Fix for handling KVM dumpfiles that contain "devices" that are not
explicitly supported. The patch skips over the unsupported/unused
device segment in the dumpfile, and searches for the next "known"
device contained in the supported device table. Without the patch,
the crash session fails during initialization with the error message
"crash: <dumpfile>: initialization failed".
(anderson@redhat.com)
- When handcrafting the backtrace starting point for the "bt" command
by using the -S option, and the starting stack address is not in
the task's process stack or in a legitimate non-process stack
address, such as a hard or soft IRQ stack address, or an x86_64
exception stack address, a message gets displayed that indicates
"non-process stack address for this task". Without the patch, the
backtrace is still attempted, which may result in a segmentation
violation, so this behavior has been changed such that the "bt"
command will fail immediately.
(anderson@redhat.com)
- Modified the help page for the "help" command to also show the
various crash-internal debug options available.
(anderson@redhat.com)
- Fix for the x86_64 "bt" command to more correctly find the starting
backtrace RIP and RSP hooks in KVM dumpfiles. Without the patch,
backtraces that should start in the interrupt or exception stacks
were not being detected correctly.
(anderson@redhat.com)
- Save the per-cpu register contents stored in the "cpu" devices of
x86_64 KVM dumpfiles, and use their contents for x86_64 backtrace RSP
and RIP hooks in the case of KVM "live dumps" where the guest system
was not in a crashed state when the "virsh dump" operation was done
on the KVM host. If an active task was running in user space when
a live dump was taken, that will be indicated by the "bt" output,
along with the user-space register contents. The x86_64 register set
saved for each cpu may be displayed with the "help -[D|n]" command.
(hutao@cn.fujitsu.com, anderson@redhat.com)
- Fix for the cpu count determination in crashed x86 KVM dumpfiles,
where the non-crashing cpus are marked offline in the kernel's
cpu_online_mask by smp_stop_cpu(). Depending upon the cpu number
of the crashing task, the cpu count may be set to a value that is
less than the number of present cpus.
(anderson@redhat.com)
- Fix for a premature failure of the "kmem -i" command with kernels
that are not configured with CONFIG_SWAP.
(per.xx.fransson@stericsson.com)
- Fix for the x86 "bt" command on 2.6.31 and later kernels when the
crash was generated by an "echo c > /proc/sysrq-trigger". Without
the patch, the backtrace does not display the exception frame from
the forced oops. This is not applicable to older kernels where
crash_kexec() is called directly from sysrq_handle_crash(), or if
an actual alt-sysrq-c keystroke sequence is entered.
(anderson@redhat.com)
- Fix for the x86 "bt" command to correctly find the starting backtrace
EIP and ESP hooks for the active tasks in KVM dumpfiles where the
kernel had crashed.
(anderson@redhat.com)
- Fix to utilize the correct "cpu" device format in x86 KVM dumpfiles.
Without the patch, the x86 registers were read in a 32-bit format,
which is only true if the host machine was running a 32-bit kernel.
With the patch, the format defaults to the 64-bit format, and is
switched to the 32-bit format if it can be determined that the host
machine was running a 32-bit kernel.
(hutao@cn.fujitsu.com, anderson@redhat.com)
- Save the per-cpu register contents stored in the "cpu" devices of
x86 KVM dumpfiles, and use their contents for x86 backtrace ESP and
EIP hooks in the case of KVM "live dumps", i.e., where the guest
system was not in a crashed state when the "virsh dump" operation
was done on the KVM host. If an active task was running in user
space when a live dump was taken, that will be indicated by the
"bt" output, along with the user-space register contents. The saved
x86 register set for each cpu may also be displayed with the
"help -[D|n]" command.
(hutao@cn.fujitsu.com, anderson@redhat.com)
- Update for the KVM-only "map" command to also store the register sets
read from the KVM dumpfile's "cpu" devices in addition to the
mapfile data when it is written to an external mapfile, or appended
to the dumpfile, so that subsequent sessions will not require the
initial scan of the KVM dumpfile.
(anderson@redhat.com)
- Fix the KVM-only "map" command to prevent its use when the session
is not being run against a KVM dumpfile, and to reject filename
arguments to the -a option or without the -f option.
(anderson@redhat.com)
(10/29/10)
5.0.8 - Fix for the "bt" command on 2.6.30 and later x86_64 kernels that
may be seen when a System.map file is used on the command line.
Without the patch, the "bt" frame-by-frame output may be interspersed
with error messages indicating "bt: invalid kernel virtual address:
<address> type: call byte".
(anderson@redhat.com)
- Fix for the KVM error messages generated by store_mapfile_offset() and
load_mapfile_offset() when an invalid physical address is issued.
The errno translation displayed by both functions was irrelevant; and
load_mapfile_offset() has been changed to show its error message only
if CRASHDEBUG(1) is in effect, making its behaviour similar to the
read functions associated with the other dumpfile types.
(anderson@redhat.com)
- Fix for the "sig" command on 2.6.35 and later kernels to account for
the "signal_struct" member name change from "count" to "nr_threads".
Without the patch, the command would fail with the error message
"sig: invalid structure member offset: signal_struct_count".
(anderson@redhat.com)
- Fix for the "net -s" command option on 2.6.33 and later kernels
to account for the "inet_sock" structure member name changes from
"daddr", "rcv_saddr", "dport", "sport" and "num" to the equivalent
name preceded by "inet_". Without the patch, the command would fail
for tasks with open sockets with the error message "net: invalid
structure member offset: inet_opt_daddr".
(anderson@redhat.com)
- Fix for the "mod" command on 2.6.35 and later kernels to account
for the removal of the "owner" member from the "attribute" structure.
Without the patch, the "mod" command fails with the error message
"mod: invalid structure member offset: attribute_owner".
(anderson@redhat.com)
- Fix for the "mount -f" command on 2.6.36 and later kernels to account
for the data type change of the super_block "s_files" member from
"struct list_head" to "struct list_head __percpu *". The open files
of a super_block are no longer contained on a single list, and are
now linked onto one of the per-cpu lists. Without the patch the
command would fail with the error message "mount: invalid kernel
virtual address: <percpu-offset> type: first list entry".
(anderson@redhat.com)
- Fix for the "files" command when the vfsmount pointer in the file
structure's "f_path" member is not suitable for the root vfsmount to
be used when reconstructing the full file pathname. Without the
patch, the pathnames of open files in the /dev directory may be
truncated and not show the "/dev" filename component.
(anderson@redhat.com)
- Change to the manner in which the cpu count is determined for x86_64
kernels. SLES11 2.6.32 kernels delay the call to crash_kexec() until
after smp_send_stop() is called by panic(), and so the cpu_online_map
cannot be used for determining the cpu count. With the patch, the
cpu_present_map is used.
(Jeffrey.Hagen@teradata.com)
- Fix for the "bt" command with 2.6.27 and later x86_64 kernels to
prevent the possible display of an invalid "vgettimeofday" frame
above the topmost "system_call_fastpath" frame, followed by two
read errors indicating "bt: read error: kernel virtual address:
ffffffffff600000 type: gdb_readmem_callback".
(anderson@redhat.com)
- Currently the "s390dbf" command uses KL_UINT() for reading pointers,
which works only if the pointers are below 4 GiB. To fix this issue
a new KL_ULONG() function has been added to read pointers correctly.
(holzheu@linux.vnet.ibm.com)
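The limitation is a plain width problem: a pointer at or above 4 GiB
does not survive a 32-bit read. A minimal sketch in C, using
hypothetical stand-in functions rather than the real KL_UINT() and
KL_ULONG() implementations, whose signatures are not shown here:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for a reader that keeps only 32 bits of a
 * stored 64-bit pointer value.
 */
uint64_t read_ptr_as_uint(const void *field)
{
	uint64_t v;

	assert(field != NULL);
	memcpy(&v, field, sizeof(v));
	return (uint32_t)v;	/* upper 32 bits are discarded */
}

/* Hypothetical stand-in for a reader that keeps all 64 bits. */
uint64_t read_ptr_as_ulong(const void *field)
{
	uint64_t v;

	assert(field != NULL);
	memcpy(&v, field, sizeof(v));
	return v;
}
```

For a stored value of 0x100000010, the first function yields 0x10
while the second returns the pointer intact.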
- Implemented the capability of building crash as an x86 binary for ARM
dumpfiles on an x86_64 host. The initial ARM support only allowed
the building of an x86 binary for ARM dumpfiles to be done from an
x86 host. To build crash as an x86 binary on an x86_64 host, enter
"make target=ARM" for the initial build; subsequent builds with ARM
support can be accomplished by entering "make" alone.
(Jan.Karlsson@sonyericsson.com)
- Simplify the ARM build procedure after an initial ARM build has
been completed in a crash source tree. With the patch, it is only
necessary to enter "make target=ARM" for the initial build; subsequent
builds can be done with "make" alone, which will continue to build
with ARM support.
(Jan.Karlsson@sonyericsson.com, anderson@redhat.com)
- Implemented the capability of building an X86 crash binary on an
X86_64 host, which can be done by entering "make target=X86". After
the initial build is complete, subsequent builds can be done by
entering "make" alone.
(anderson@redhat.com)
- Fix for a regression in get_text_init_space() due to logic added by
the ARM processor support. Without the patch, the function would not
recognize the failure to find the kernel's .text.init address for
non-ARM architectures.
(perr.fransson.ml@gmail.com, anderson@redhat.com)
- Implemented support for SMP on the ARM architecture.
(per.xx.fransson@stericsson.com)
- Fix for the x86_64 "bt" command on 2.6.31 and later kernels when the
crash was generated by an "echo c > /proc/sysrq-trigger". Without
the patch, the backtrace starts at sysrq_handle_crash() and does
not display the exception frame from the forced oops. This is not
applicable to older kernels where crash_kexec() is called directly
from sysrq_handle_crash(), or if an actual alt-sysrq-c keystroke
sequence is entered.
(anderson@redhat.com)
- Fix to recognize module "init" symbols that are still valid, whose
vmalloc'd virtual memory has not been vfree'd by sys_init_module().
Without the patch those symbols are not visible by any of the "sym"
command options, nor by commands that try to translate their virtual
addresses to a symbol name, such as the "bt" command if the kernel
crashed during a module load.
(hutao@cn.fujitsu.com, anderson@redhat.com)
(10/06/10)
5.0.7 - Introduction of ARM processor support for the crash utility. This is
the result of a collaborative effort between Nokia and Sony Ericsson.
The crash utility can be built as a native ARM binary to analyze ARM
dumpfiles or run live on an ARM host, or alternatively it can be
built as an x86 binary to analyze ARM dumpfiles. To build crash as
an ARM binary on an ARM host, enter "make" alone. To build crash as
an x86 binary on an x86 host, enter "make target=ARM". By extension,
the x86 binary can also be run on an x86_64 host. It supports the
kdump and diskdump formats, and live sessions using /dev/mem on ARM
hosts. Stack unwinding support uses both frame pointers and ARM
unwind tables.
(ext-mika.1.westerberg@nokia.com, Jan.Karlsson@sonyericsson.com,
Thomas.Fange@sonyericsson.com)
- Fix to support KVM dumpfiles that have "ram" device header sections
with a version_id of 4. Without the patch, the crash session fails
with the error message "crash: qemu-load.c:267: ram_init_load:
Assertion `version_id == 3' failed".
(pbonzini@redhat.com, anderson@redhat.com)
- Fix for KVM dumpfiles from guests that were provisioned with more
than 3.5GB of RAM. KVM virtual systems contain an I/O hole in the
physical memory region from 0xe0000000 to 0x100000000 (3.5GB to 4GB).
If a guest is provisioned with more than 3.5GB of RAM, then the
memory above 3.5GB is "pushed up" to start at 0x100000000 (4GB).
But the "ram" device headers in the KVM dumpfiles do not reflect
that, and so without the patch, all kinds of error messages would be
displayed during invocation, and in all probability, the session
would fail.
(anderson@redhat.com)
- Minor fix to memory.c to address a compiler warning when building
with "make warn", or a compiler failure when using "make Warn".
(anderson@redhat.com)
- Fix for a segmentation violation caused by the "mount" command in the
rare circumstance where the "init" task (pid 1) does not exist.
(bob.montgomery@hp.com)
- CONFIG_PREEMPT_RT x86_64 realtime kernels allocate only 3 exception
stacks to handle the 5 possible exception types, and therefore the
same per-cpu stack may be used for different exception types. This
could cause "bt" output that contained exception stack name strings
to be incorrect. With the patch, all exception stack name strings
in RT kernels show "RT", as in "--- <RT exception stack> ---".
(anderson@redhat.com)
- Fix for the potential to miss one or more tasks in 2.6.23 and earlier
kernels, presumably due to catching an entry in the kernel's pid_hash[]
chain in transition. Without the patch, the task will simply not be
seen in the gathered task list.
(bob.montgomery@hp.com)
- Fix to correct a presumption that the kernel's task_state_array[]
is NULL terminated.
(holzheu@linux.vnet.ibm.com)
(08/27/10)
5.0.6 - Also available in Fedora Rawhide devel branch:
build: dist-f14 crash-5.0.6-2.fc14
http://koji.fedoraproject.org/koji/buildinfo?buildID=184746
- Fix for support of xendump and Xen kdump dumpfiles from 2.6.27 and
later x86_64 kernels. Without the patch, the crash session
would fail during initialization with the error message "crash:
cannot resolve end_pfn".
(ptesarik@suse.cz)
- Fix for the "s390dbf" command to allow the command's output to be
redirected to a pipe.
(holzheu@linux.vnet.ibm.com)
- Fix for the x86 "bt" command to generically recognize the end of
trace condition for tasks entering the kernel from user-space without
having to hardwire any more kernel entry point function names.
Without the patch, a task that took a clock interrupt from user-space
and crashed while operating on the soft IRQ stack failed with the
error message "bt: cannot resolve stack trace".
(anderson@redhat.com)
- Display the "machine type mismatch" warning when attempting to use a
ppc64 vmlinux file on a non-ppc64 64-bit host. Without the patch,
the fact that ppc64 vmlinux ELF files are type ET_DYN, and not type
ET_EXEC like all of the other architectures, was allowing the vmlinux
to be accepted without the mismatch warning, and would subsequently
fail without a meaningful explanation being displayed.
(anderson@redhat.com)
- Fix for the x86_64 "bt" command if the kdump-generated NMI interrupts
a multi-threaded task that has just entered kernel space but has not
changed the RSP stack pointer register from its user-space per-thread
stack location to the kernel stack. Without the patch, the command
follows the display of the exception frame on the NMI exception stack
with the message "WARNING: possibly bogus exception frame", displays
the error message "bt: cannot transition from exception stack to
current process stack", and does not display the user-space exception
frame.
(anderson@redhat.com)
- Added the "set" command to the list of commands acceptable when
running in --minimal mode. The command is limited to the setting
of internal variables since there is no task context in that mode.
(anderson@redhat.com)
- Fix for the "vtop" command when run against x86 Xen PAE kernels.
Without the patch, the "PAGE:" displays (machine and pseudo-physical)
contained non-zero values in the lower 12 bits, and the translation
of the PTE entry was incorrect as a result of receiving the incorrect
contents in the lower 12-bits of the PTE entry.
(anderson@redhat.com)
- Implemented support for running against live x86_64 pv_ops/Xen guest
kernels.
(anderson@redhat.com)
- Implemented support for xendump ELF dumpfiles generated from x86_64
pvops/Xen guest kernels.
(anderson@redhat.com)
- Implemented support for running against live x86 pv_ops/Xen guest
kernels.
(anderson@redhat.com)
- Implemented support for xendump ELF dumpfiles generated from x86
pvops/Xen guest kernels.
(anderson@redhat.com)
- Determine the bit positions of PG_reserved and PG_slab using the
newer pageflags enumerator values if available.
(anderson@redhat.com)
- Fix to prevent the "repeat" command from keeping a crash session
alive if the controlling terminal session is killed.
(anderson@redhat.com)
(07/20/10)
5.0.5 - Implemented a new "bt -F" flag as an extension of the "bt -f" flag,
which dumps the contents of each stack frame in a backtrace; but
similar to the output of "rd -S", if a stack entry can be expressed
symbolically it will be displayed as "symbol+offset", or if a stack
entry comes from a slab cache, the slab cache name will be displayed
inside brackets.
(anderson@redhat.com)
- Fix for the %install stanza in the crash.spec file to check for the
existence of the sial.so extension module. Without the patch, when
building the src.rpm with rpmbuild on a system without the bison and
flex packages, the installation phase fails with the error messages
"cp: cannot stat `extensions/sial.so': No such file or directory" and
"error: Bad exit status from /var/tmp/rpm-tmp.ubJZfY (%install)".
(anderson@redhat.com)
- Minor correction to the error message displayed by the "crash -x"
command line option when the pre-loading of an extension module fails.
Without the patch, the error message may be preceded with "foreach:"
instead of "crash:" when running with dumpfiles in which no panic
task could be ascertained.
(anderson@redhat.com)
- Fix for the x86_64 "bt" and "bt -E" commands if the kdump-generated
NMI interrupts a task that has just entered kernel space but has
not changed the RSP stack pointer register from its user-space
location to the kernel stack. Without the patch, the "bt" command
follows the display of the exception frame on the NMI exception
stack with the message "WARNING: possibly bogus exception frame",
displays the error message "bt: cannot transition from exception
stack to current process stack", and does not display the user-space
exception frame; the "bt -E" command does not find and display the
kernel-space exception frame generated by the NMI.
(anderson@redhat.com)
- Fix for compiler warnings when building "net.c" with -O2. Without
the patch, there are 6 messages that indicate "warning: dereferencing
pointer '<arg>' does break strict-aliasing rules".
(anderson@redhat.com)
- Fix for a compiler warning when building "gdb_interface.c" with -O2.
Without the patch, there is 1 message that indicates "warning:
dereferencing type-punned pointer will break strict-aliasing rules".
(anderson@redhat.com)
- Fix for the x86 "bt" command if the crash occurs during the execution
of a kernel module's init_module() function. Without the patch, the
backtrace attempt would contain a stack frame with a "(null)" symbol
name, display "bt: cannot resolve stack trace", dump the text symbols
on the stack, and any possible exception frames.
(anderson@redhat.com)
- Fix to support newer KVM dumpfile format generated by "virsh dump"
that may contain "block" and "kvmclock" devices in the dumpfile
header. Without the patch, the session fails with the error message
"crash: <dumpfile>: initialization failed".
(pbonzini@redhat.com, anderson@redhat.com)
- Fix for "kmem -[sS]" command on 2.6.34 and later CONFIG_SLUB kernels,
which changed the kmem_cache.cpu_slab[NR_CPUS] array to be a per-cpu
offset value. Without the patch, the command fails with the message
"kmem: cannot determine location of kmem_cache.cpu_slab page".
(anderson@redhat.com)
- Modified the "kmem -p" output to show the "INDEX" column values with
a hexadecimal value because the "page.index" member is typically a
shared-use field that may also contain a pointer value. Without the
patch, pointer values would be displayed as large negative decimal
values.
(anderson@redhat.com)
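The difference between the two display formats can be illustrated
with a short C sketch. The function name and the as_hex parameter are
hypothetical and are not the "kmem -p" implementation; the example
pointer value is arbitrary.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical formatter: render a shared-use 64-bit field either as
 * a signed decimal (old display) or as hexadecimal (new display).
 * Returns the snprintf() result.
 */
int format_index(char *buf, size_t len, uint64_t index, int as_hex)
{
	assert(buf != NULL && len > 0);

	if (as_hex)
		return snprintf(buf, len, "%" PRIx64, index);

	/* A kernel pointer has its top bit set, so it prints negative. */
	return snprintf(buf, len, "%" PRId64, (int64_t)index);
}
```

With an index of 0xffff880012345678, the decimal form begins with a
minus sign, whereas the hexadecimal form reads as the pointer itself.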
- Addressed compiler warnings generated by net.c when built with -O2.
(anderson@redhat.com)
- Fix for the "kmem <address>" command if the kernel's free page
lists are corrupt, or in a state of flux, and cannot be followed.
Without the patch, the search for page usage in the various locations
is aborted if the free page lists cannot be traversed.
(anderson@redhat.com)
- Fix to read KVM dumpfiles generated by the "virsh dump" of a RHEL5
guest from a RHEL6 host, and to support dumpfile format changes that
contain "apic" and "__rhel5" devices. Without the patch, the session
may fail during initialization with a segmentation violation, or with
the error message "crash: <dumpfile>: initialization failed".
(pbonzini@redhat.com)
(06/16/10)
5.0.4 - Fix for the x86 "bt" command when a newly-forked task's resumption
EIP address value is set to the "ret_from_fork" entry point by
copy_thread(). Without the patch, the backtrace attempt would
display "bt: cannot resolve stack trace", dump the text symbols on
the stack, and a possible USER-MODE exception frame.
(anderson@redhat.com)
- Fix for the x86 "bt" command if the kdump-generated NMI interrupts
a task running in kernel space at a point in the system_call entry
point code prior to the call to a system call function. Without the
patch, the backtrace attempt would display "bt: cannot resolve stack
trace", dump the text symbols on the kernel stack, and display any
"KERNEL-MODE" exception frames followed by a possible "USER-MODE"
exception frame.
(anderson@redhat.com)
- Fix for the "bt" command on 2.6.29 and later x86_64 kernels to
recognize and display exception frames generated by exceptions that
do not result in a stack switch, such as general protection faults.
Without the patch, the backtrace would potentially not display the
exception frames because the "error_exit" assembly-code label in
entry_64.S was replaced by the error_exit() entry point.
(anderson@redhat.com)
- The kernel patch for ppc64 CONFIG_SPARSEMEM_VMEMMAP kernels that
stores vmemmap page mapping information so that the crash utility
is able to translate vmemmap'd kernel virtual addresses has been
updated. The crash utility patch that was (preemptively) applied
in 5.0.2 for the initial kernel patch needs this update.
(anderson@redhat.com)
- Fix the error message for the "dev -p" command when run on 2.6.26
or later kernels, which no longer have the global "pci_devices"
list head. The patch changes the message to show "dev: -p option
not supported or applicable on this architecture or kernel", instead
of the misleading "dev: no PCI devices found on this system" message.
(anderson@redhat.com)
- If a cpu in an s390 or s390x dumpfile is offline, and the "bt"
command receives a backtrace request for the "swapper" task on that
cpu, the command will display "CPU offline".
(holzheu@linux.vnet.ibm.com)
- Fix for 2.6.34 and later x86_64 kernels which generate per-cpu
symbols of type 'd' or type 'D' instead of type 'V'. Without the
patch, an x86_64 crash session fails during initialization with the
error message "crash: cannot determine idle task addresses from
init_tasks[] or runqueues[]", followed by "crash: cannot resolve
init_task_union". It is unclear why some kernel builds result in
only type 'V' per-cpu symbols, whereas others result in type 'd'
and 'D', so the patch accepts both.
(Kashyap.Desai@lsi.com)
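The symbol-type check described above can be sketched roughly as
follows. This is only an illustration: the function name and the
sample (name, type) pairs are hypothetical and are not taken from the
crash sources.

```python
# Hypothetical sketch: treat nm-style symbol types 'd', 'D' and 'V'
# as possible per-cpu symbols, as the entry above describes.
PER_CPU_SYMBOL_TYPES = {"d", "D", "V"}


def is_per_cpu_symbol_type(sym_type):
    """Return True if the symbol type may denote a per-cpu symbol."""
    return sym_type in PER_CPU_SYMBOL_TYPES


# Made-up sample data: (symbol name, symbol type).
symbols = [
    ("runqueues", "d"),
    ("current_task", "D"),
    ("jiffies", "B"),
    ("kstat", "V"),
]

candidates = [name for name, sym_type in symbols
              if is_per_cpu_symbol_type(sym_type)]
```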
- Fix to prevent a segmentation violation during initialization in
the x86_64_get_active_set() function by verifying that the array
of current tasks in machdep->machspec->current[] has actually been
allocated. Theoretically it should never be NULL, but in the
unlikely event that x86_64_per_cpu_init() fails to find the required
per-cpu symbols, it will return without allocating the array.
(anderson@redhat.com)
- Fix to support KVM dumpfiles created with "virsh dump" that create
"cpu" header sections using a QEMU CPU_SAVE_VERSION version greater
than the supported version of 9. Without the patch, the crash
session fails during initialization with the error message "crash:
qemu-load.c:501: cpu_init_load_64: Assertion `version_id >= 4 &&
version_id <= 9' failed." The patch now accepts CPU_SAVE_VERSION
values up to 12.
(anderson@redhat.com)
- Fix for x86_64 KVM dumpfiles created with "virsh dump" whose kernels
have a "_text" virtual address higher than __START_KERNEL_map.
Without the patch, the physical base address calculation fails,
making the dumpfile unusable.
(anderson@redhat.com, pbonzini@redhat.com)
- Implemented a new "map" command that is seen only when running with
KVM guest dumpfiles created with "virsh dump". The layout of this
dumpfile format does not allow the access of system memory in a
"random-access" manner. Therefore, during session initialization, a
potentially time-consuming dumpfile scan procedure is required to
create a physical-memory-to-file-offset memory map for use during the
session. The new "map" command allows the user to either append the
memory map to the end of the dumpfile, or to create a discrete memory
map file. In either case, the dumpfile scan will not be required
during subsequent sessions. The command's help page may be seen by
entering "crash -h map".
(anderson@redhat.com)
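The idea of a physical-memory-to-file-offset memory map can be
sketched as below. This is a minimal illustration under assumed
names and an assumed (phys_start, length, file_offset) layout; it
does not reflect the actual on-disk map format or the crash sources.

```python
import bisect


def build_memory_map(segments):
    """Sort (phys_start, length, file_offset) tuples by physical address."""
    return sorted(segments)


def phys_to_file_offset(memory_map, paddr):
    """Translate a physical address to a dumpfile offset, or None if unmapped."""
    starts = [seg[0] for seg in memory_map]
    # Find the last segment that starts at or below paddr.
    i = bisect.bisect_right(starts, paddr) - 1
    if i < 0:
        return None
    phys_start, length, file_offset = memory_map[i]
    if paddr >= phys_start + length:
        return None
    return file_offset + (paddr - phys_start)


# Made-up segments, as a one-time dumpfile scan might produce them.
memory_map = build_memory_map([
    (0x100000, 0x1000, 0x9000),
    (0x0, 0x1000, 0x4000),
])
```

Once such a map has been saved, later sessions could load it rather
than rescanning the dumpfile, which is the saving the "map" command
is meant to provide.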
- Fix for an incorrect calculation of the physical base address of a
fully-virtualized x86_64 RHEL6 guest kernel running on a RHEL5 Xen
host. Without the patch, the session failed during initialization
with the error messages "crash: cannot determine base kernel version"
and "crash: vmlinux and vmcore do not match!"
(anderson@redhat.com)
- Fix for the "bt" command on inactive (blocked) tasks on 2.6.33 and
later x86_64 kernels, which have the "thread_return" symbol removed
from the embedded "switch_to" macro. Without the patch, when run on
blocked tasks, the command would fail with the error message "bt:
cannot resolve thread_return".
(anderson@redhat.com)
- Fix for the "bt" command on 2.6.33 and later x86 kernels, which moved
the "system_call" assembly function to the .kprobes.text section.
Without the patch, the command would typically display two invalid
stack frames, both indicating they were in "ia32_sysenter_target".
(anderson@redhat.com)
- Fix for a segmentation violation caused by the "extensions/trace.c"
extension module, as seen when running the "trace show -c <cpu>"
command from that module.
(laijs@cn.fujitsu.com)
- Implemented a "trace dump -t" command for the "extensions/trace.c"
extension module. The module already has a "trace show" command
to show what events had happened before the system crashed, but it
is just 1000 lines of code and it is not as complete as the related
"trace-cmd report" command from trace-cmd(1). The new extension
module command generates a "trace.dat" file, which in turn can be
used by the "trace-cmd report" option of trace-cmd(1). This patch
therefore improves both the crash trace command and trace-cmd(1),
which can now handle ftrace data even if the kernel crashed.
(laijs@cn.fujitsu.com)
- 5.0.3 to 5.0.4 incremental patch
(05/21/10)
5.0.3 - Fix for running against 2.6.34 and later kernels to recognize and
handle changes in the kernel's per-cpu data symbol naming, which
no longer prefixes "per_cpu__" to declared per-cpu symbol names.
Without the patch, an x86_64 crash session fails completely during
initialization with the error message "crash: cannot determine idle
task addresses from init_tasks[] or runqueues[]", followed by "crash:
cannot resolve init_task_union"; on architectures such as the x86,
the session comes up to the "crash>" prompt, but displays warning
messages such as "WARNING: duplicate idle tasks?", the "swapper"
tasks are not found, and any command accessing per-cpu data fails.
(anderson@redhat.com)
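One way to picture the naming change is a lookup that tries the
legacy "per_cpu__" prefixed name first and then the bare name. The
sketch below is an assumption for illustration only; the function
name, the dictionary-based symbol tables, and the addresses are all
hypothetical.

```python
PER_CPU_PREFIX = "per_cpu__"


def resolve_per_cpu_symbol(symtab, name):
    """Look up a per-cpu symbol by its prefixed name, then its bare name.

    Returns (symbol_name, address), or None if neither form exists.
    """
    for candidate in (PER_CPU_PREFIX + name, name):
        if candidate in symtab:
            return candidate, symtab[candidate]
    return None


# Made-up symbol tables for a pre-2.6.34 and a 2.6.34-or-later kernel.
old_kernel = {"per_cpu__runqueues": 0x1000}
new_kernel = {"runqueues": 0x2000}
```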
- Fix for "swap" and "kmem -i" commands on 2.6.29 or later, big-endian,
ppc64 kernels, where the swap_info_struct.flags member was changed
from an int to a long. Without the patch, the "swap" command does
not display any swap data, and the "kmem -i" command indicates that
there is no swap memory used or available.
(anderson@redhat.com)
- Fix for the "vm" and "ps" command's task RSS value on 2.6.34 or later
kernels. Without the patch, the RSS value would always show "0k" or
"0" respectively, due to the replacement of the mm_struct._anon_rss
and mm_struct._file_rss counters with the mm_struct.rss_stat member.
(anderson@redhat.com)
- Fixed "possible aternatives" spelling typo used in informational
messages when an incorrect/unknown symbol name is used in "rd",
"dis", "sym" and "struct" commands.
(anderson@redhat.com)
- Fix for CONFIG_SPARSEMEM kernels that are not configured with
CONFIG_SPARSEMEM_EXTREME. Without the patch, "kmem -n" would show
faulty/missing sparsemem memmap data, and as a result, commands
requiring verification of page structure addresses contained within
the "missing" memmap sections would fail or show questionable data.
(nishimura@mxp.nes.nec.co.jp, anderson@redhat.com)
- Change the output of the "kmem -[cC]" options to indicate that they
are not supported when that is relevant. "kmem -c" has been obsolete
since 2.6.17, and "kmem -C" has been obsolete in all 2.6 era kernels.
(anderson@redhat.com)
- 5.0.2 to 5.0.3 incremental patch
(04/08/10)
5.0.2 - Fix for the "mod -[sS]" command if the attempt to load a kernel
module fails due to an internal gdb error. Without the patch, the
"mod" command displays error messages of the sort:
*** glibc detected *** crash: double free or corruption (!prev): <address> ***
<segmentation violation in gdb>
mod: <module-name>
gdb add-symbol-file command failed
and then hangs. With the patch, a module-related error message is
displayed, the "mod" command fails, and the session continues.
(anderson@redhat.com)
- Fix for the "mod -[sS]" command options, which may display the error
message "mod: <module>: last symbol is not _MODULE_END_<module>?"
for one or more modules. That message indicates that the module's
symbol values have been incorrectly modified by the "mod" command,
and even if the error message is not displayed, it is still possible
that the symbol values of some modules may have been incorrectly
modified. With the fix, the "mod -[sS]" command will not recalculate
and modify module symbol values from their CONFIG_KALLSYMS-generated
values.
(anderson@redhat.com)
- Fix for the reading of dumpfiles created with the "snap" extension
module when used on an x86 machine with a single PT_LOAD segment that
starts at a non-zero address. Without the patch, a crash session
with such an x86 snapshot dumpfile fails during initialization with
the error message "crash: vmlinux and <snapshot> do not match!"
(anderson@redhat.com)
- Fixes for several bugs in the s390 and s390x stack backtrace code:
(1) Add panic stack as second interrupt stack
(2) Fix printing of access registers (4 bytes instead of 8 bytes)
(3) Use u64 for s390x register 14
(4) Fix interrupt stack handling for s390x (use 160 byte overhead
instead of 96)
(holzheu@linux.vnet.ibm.com)
- Fix for the "mach -m" command option on x86 or x86_64 systems whose
BIOS-provided e820 map contains an EFI-related memory type value that
has not been mapped to an E820 type (pre-2.6.27), or if the type is
E820_UNUSABLE (2.6.28 and later). Without the patch, the "mach -m"
command would result in a segmentation violation. With the fix,
an EFI type will be displayed as "type <number>" on pre-2.6.27
kernels, and the mapped E820 type on 2.6.27 and later kernels.
(anderson@redhat.com)
- Fix for SIAL extension module if a script uses structures that
contain members of type "bool". Without the patch, running such
a script fails with the error message "File <filename>, line 279,
Error: Oops drilldowntype".
(holzheu@linux.vnet.ibm.com)
- Fix to prevent a stream of harmless but annoying error messages when
running "crash -d4" (or any larger -d debug value) on x86 machines.
Without the patch, after the "crash: get_cpus_online: online: <cpus>"
debug message, there is a stream of "crash: input string too large:"
and "crash: invalid input:" messages prior to the next legitimate debug
message.
(anderson@redhat.com)
- Fix for the "kmem -s list" command option on non-CONFIG_SLUB kernels
that contain a "cache_chain" list_head symbol instead of having a
"#define cache_chain (cache_cache.next)" construct. Without the
patch, the command would incorrectly presume that the "cache_chain"
address was that of a kmem_cache structure, may display a warning
message "kmem: WARNING: cannot read kmem_cache_s.name string at
<address>", and then show the "cache_chain" symbol address followed
either by a name of "(unknown)" or by a string of gibberish.
(anderson@redhat.com)
- Fix for the x86_64 "bt" command to recognize, and take advantage of,
kernels that were built with CONFIG_FRAME_POINTER. In that case, the
frame pointer values pushed onto the kernel stack are now used to
calculate stack frame sizes, resulting in more accurate backtraces.
(anderson@redhat.com)
- Change the ppc64 cpu count displayed by the initial system banner
and by the "sys" and "mach" commands to be the number of cpus online.
(lnx1138@linux.vnet.ibm.com)
- Fix for the x86_64 "bt" command's stack frame size calculator on
kernels that were built without CONFIG_FRAME_POINTER. Without the
patch, in the relatively rare case where a function does a "retq"
prior to the targeted text return address, the frame size calculation
could be too small, which in turn could result in an intervening,
stale, frame entry.
(anderson@redhat.com)
- Fix to prevent a crash session that is run over a network connection
that is killed/removed from going into a 100% cpu-time loop. Without
the patch, the behavior of the built-in readline() library call in
gdb-7.0 has changed such that the function returns when the EOF is
encountered on /dev/tty, and the crash session goes into an endless
loop; whereas in gdb-6.1, the readline() call never returns because
the crash session gets killed while running in the library code.
(anderson@redhat.com)
- Change the output of "ps -t" to display the task_struct's utime and
stime values unmodified on kernels using a cputime_t (unsigned long)
to store those values.
(anderson@redhat.com)
- Fix for the x86 "bt" command if the kdump-generated NMI interrupts
a process in kernel space at a point before the full user-mode
exception frame (pt_regs) gets written on the kernel stack. Without
the patch, the backtrace attempt would display "bt: cannot resolve
stack trace", dump the text symbols on the kernel stack, and would
not find/display a "USER-MODE" exception frame; the fix simply shows
the interrupted entry-point function name and stack pointer.
(anderson@redhat.com)
- Fix for the "bt -e" command on 2.6.30 or later x86 kernels if the
x86.c file was built with -D_FORTIFY_SOURCE. Without the patch, the
command would cause the crash session to abort with the error message
"*** buffer overflow detected ***: crash terminated".
(anderson@redhat.com)
- Fix for initialization-time failure on 2.6.34 and later kernels that
were configured with CONFIG_NO_BOOTMEM. Without the patch, the crash
session fails with the error message "crash: invalid structure member
offset: pglist_data_bdata".
(anderson@redhat.com)
- Fix for the processor speed value displayed on ppc and ppc64 machines
at session invocation, and by the "sys" and "mach" commands. Without
the patch, Power6 machines indicate "(unknown Mhz)".
(pavan@linux.vnet.ibm.com)
- Implemented support to recognize an IBM-proposed kernel patch for
ppc64 CONFIG_SPARSEMEM_VMEMMAP kernels that will store vmemmap page
mapping information. Currently on 2.6.26 and later ppc64 kernels
configured with CONFIG_SPARSEMEM_VMEMMAP, there is an initialization
time warning message indicating "WARNING: cannot translate vmemmap
kernel virtual addresses: commands requiring page structure contents
will fail", alerting the user that vmemmap'd page structures cannot
be accessed. When the kernel patch is eventually applied, this patch
will recognize it and be able to translate vmemmap'd kernel virtual
addresses.
(anderson@redhat.com)
- Fix for "kmem -[sS]" command options on live CONFIG_SLAB systems to
prevent the redundant reading of the shared array_cache object list
from the per-node kmem_list3 data structures. Without the patch, it
is possible that there could be a series of error messages indicating
"kmem: <cache-name> cache: total shared array_cache.avail <number>
greater than total limit <number>", followed by "*** glibc detected
*** crash: double free or corruption (!prev): <address> ***", a
backtrace, and the abort of the crash session.
(anderson@redhat.com)
- 5.0.1 to 5.0.2 incremental patch
(03/26/10)
5.0.1 - Due to a change in the x86 disassembler output from the embedded
gdb-7.0 that was introduced in crash version 5.0.0, there may be
a stream of warning messages during invocation that indicate
"crash: invalid input: <string>:" and "crash: input string too
large: <string>: (9 vs 8)" on 2.6.20 and earlier x86 kernels.
(anderson@redhat.com)
- As of glibc 2.11, the mkstemps() function has been introduced as a
versioned symbol. As a result, crash utility binaries built on host
machines with glibc 2.11 or later cannot be run on systems that run
pre-2.11 glibc versions, failing during invocation with the error
message "crash: relocation error: crash: symbol mkstemps, version
GLIBC_2.11 not defined in file libc.so.6 with link time reference".
With the patch, the pre-existing version of mkstemps() from the
built-in libiberty.a library will always be used.
(jmoyer@redhat.com)
- Fix for the "irq" command on 2.6.33 and later kernels to account for
the removal of the irqaction.mask structure member. Without the
patch, the "irq" command fails with the error message "irq: invalid
structure member offset: irqaction_mask".
(bernhard@bwalle.de)
- Added a defensive mechanism to handle a corrupted "cache_cache"
kmem_cache structure. Without the patch, a vmcore that had such
a corruption caused a failure during invocation with the error
message "crash: zero-size memory allocation!".
(anderson@redhat.com)
- Fix for the "swap", "kmem -i", and "vm -p" commands to account for
the 2.6.33 kernel changes to the swap_info_struct data structure and
the swap_info[] array type. Without the patch, "swap" would show
only the command's header, "kmem -i" would show zero swap usage, and
"vm -p" would show "(unknown swap location)" when translating the
swap file name for any swapped-out pages in the task.
(anderson@redhat.com)
- Fix for a segmentation violation during session invocation when
running against 2.6.30 or later x86_64 dumpfiles whose kernel is not
configured with CONFIG_SMP.
(anderson@redhat.com)
- Fix for the "bt" command on an ia64 "INIT" process that interrupted
a task that was running in user space, but was unable to modify the
original (interrupted) task's stack. Without the patch, the "INIT"
task's backtrace would not display the task that was interrupted,
and would display the error message "bt: unwind: failed to locate
return link (ip=<user-virtual-address>)!". With the patch, the
interrupted task information is displayed in the same manner as if
the original stack had been modified.
(tindoh@redhat.com)
- Fix for x86, s390, s390x and ia64 architectures to set the system
cpu count equal to the highest cpu online plus one. Without the
patch, those architectures would use the number of online cpus as
the system's total cpu count, which would be misleading when any
offline cpu number was less than the highest online cpu number.
(anderson@redhat.com)
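The difference between the two counts can be shown with a small
sketch. The function names and the bitmask representation of the
online cpus are assumptions made for this illustration.

```python
def system_cpu_count(online_mask):
    """Return the highest online cpu number plus one (0 if none online)."""
    return online_mask.bit_length()


def online_cpu_count(online_mask):
    """Return the number of cpus whose bit is set in the online mask."""
    return bin(online_mask).count("1")


# Example: cpus 0, 1 and 3 are online; cpu 2 is offline.
online_mask = 0b1011
```

With this mask the online count is 3, which would hide cpu 3 if it
were used as the total; the highest-online-plus-one value of 4 covers
every cpu number in use.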
- Fix for package build failure on x86_64 when using gcc-4.5. Without
the patch, these types of errors are generated:
unwind_x86_32_64.c:50:2: error: initializer element is not constant
unwind_x86_32_64.c:50:2: error: (near initialization for 'reg_info[7].offs')
unwind_x86_32_64.c:50:2: error: initializer element is not constant
unwind_x86_32_64.c:50:2: error: (near initialization for 'reg_info[8].offs')
(troy.heber@hp.com)
- Fix to recognize the symbol type change of per-cpu variables from
'd' or 'D' to 'V'. Without the patch, entering a command of the
form "p per_cpu__<variable>" would fail with the error message
"p: gdb request failed: p per_cpu__<variable>". With the fix,
the symbol is recognized as a per-cpu variable, in which case the
data type of the variable is displayed, followed by a list of the
virtual addresses of each per-cpu instance of the variable.
(anderson@redhat.com)
- Fix for the "struct" and "union" commands when passed an address that
is in a valid kernel virtual address region but is either unmapped or
non-existent. Without the patch, the following three error messages
are displayed:
struct <name> struct: invalid kernel virtual address:
<kernel-address> type: "gdb_readmem_callback"
gdb called without error_hook: Cannot access memory at address
<kernel-address>
*** glibc detected *** crash: double free or corruption (!prev):
<crash-address> ***
followed by a backtrace and the crash utility memory map. The session
aborts at that point. With the fix, the commands will fail gracefully
after displaying error messages reporting that the kernel virtual
address cannot be accessed.
(anderson@redhat.com)
- Update for 2.6.33 and later s390 and s390x kernels to account for the
"_lowcore" structure member name change from "st_status_fixed_logout"
to "psw_save_area".
(holzheu@linux.vnet.ibm.com)
- Fix for very large Xen domU dumpfiles that locate the base offset of
relevant ELF sections beyond the 4GB mark. Without the patch, the
crash session fails with the error messages "crash: cannot find mfn
<number> (0x<number>) in page index" followed by "crash: cannot
read/find cr3 page".
(anderson@redhat.com, xiaowei.hu@oracle.com)
- If a kernel crash occurs during a kernel module loading operation,
it is possible that a subsequent crash session on the vmcore may
result in a segmentation violation during the "please wait...
(gathering module symbol data)" phase.
(john.wright@hp.com)
- Fix for a gdb-7.0 regression that causes the line number capability
to fail with certain ranges of x86 base kernel text addresses.
Without the patch, the "dis -l <symbol>" or "sym <symbol>"
commands would fail to show line number information for certain
ranges of base kernel text addresses.
(anderson@redhat.com)
- Fix for the "bt" command when run on offline s390/s390x "swapper"
idle tasks. Without the patch, the command fails with the error
message "bt: invalid kernel virtual address: ffffffffffffc000
type: async_stack".
(holzheu@linux.vnet.ibm.com)
- Preparation for future s390x ELF dumpfile format.
(holzheu@linux.vnet.ibm.com)
- 5.0.0 to 5.0.1 incremental patch
(02/18/10)
5.0.0 - Updated embedded gdb version to FSF gdb-7.0.
(anderson@redhat.com)
- Fix for the ppc64 "irq" command where the "irq_desc_t" is no longer
recognized as a typedef for "struct irq_desc". Without the patch,
the command fails with the error message: "irq: invalid structure
size: irqdesc".
(anderson@redhat.com)
- Fix for 2.6.26 and later ppc64 CONFIG_SPARSEMEM_VMEMMAP kernels to
recognize VMEMMAP_REGION virtual addresses. The kernel's memmap page
structure array(s) are mapped in that region, and without the fix,
the vmemmap virtual addresses were being erroneously translated using
the kernel's page tables that map the VMALLOC_REGION. This in turn
led to bogus data being read for all page structure content requests,
resulting in invalid error messages for commands such as "kmem -s",
"kmem -p", "kmem -f", etc. A secondary issue is that there is no
current manner for the crash utility to be able to translate vmemmap
addresses because there is no record of the mapping stored in the
kernel. That being the case, any command that needs to read the
contents of a page structure will fail. During initialization, the
message "WARNING: cannot translate vmemmap kernel virtual addresses:
commands requiring page structure contents will fail" will alert the
user of the problem. During runtime, an attempt to read the contents
of a vmemmap'd page structure will fail with the error message
"<command>: cannot translate vmemmap address: <vmemmap address>".
(anderson@redhat.com)
- Fix for segmentation violation when running the "ps -r" command
option on 2.6.25 or later kernels.
(anderson@redhat.com)
- Fix for the "mount" command on 2.6.32 and later kernels. Without the
patch, the command would fail immediately with the error message
"mount: invalid structure member offset: super_block_s_dirty". Also,
the "mount -i" option will no longer be supported in 2.6.32 and later
kernels because the super_block.s_dirty linked list no longer exists.
(anderson@redhat.com)
- Fix for the "bt" command on 2.6.29 and later x86_64 kernels to
always recognize and display BUG()-induced exception frames. Without
the patch, the backtrace would potentially not display the exception
frame.
(anderson@redhat.com)
- Fix for the "rd" and "kmem" commands to prevent the unnecessary
"WARNING: sparsemem: invalid section number: <number>" message
when testing whether an address is represented by a page structure
in CONFIG_SPARSEMEM_EXTREME kernels.
(anderson@redhat.com)
- Fix for a 4.0-8.11 regression that introduced a bug in determining
the number of cpus in ppc64 kernels when the cpu_possible_[map/mask]
has more cpus than the cpu_online_[map/mask]. In that case, the
kernel contains per-cpu runqueue data and "swapper" tasks for the
extra cpus. Without the patch, on systems with a possible cpu count
that is larger than its online cpu count:
(1) the "sys" command will reflect the possible cpu count.
(2) the "ps" command will show the existent-but-unused "swapper"
tasks as active on the extra cpus.
(3) the "set" command will allow the current context to be set to
any of the existent-but-unused "swapper" tasks.
(4) the "runq" command will display existent-but-unused runqueue
data for the extra cpus.
(5) the "bt" command on the existent-but-unused "swapper" tasks will
indicate: "bt: cannot determine NT_PRSTATUS ELF note for active
task: <task>" on dumpfiles, and "(active)" on live systems.
(anderson@redhat.com)
- 4.1.2 to 5.0.0 incremental patch
(01/06/10)
4.1.2 - Fix for 2.6.31 or later x86_64 CONFIG_NEED_MULTIPLE_NODES kernels
running on systems that have multiple NUMA nodes. By default, those
kernels use the "page" (or "lpage") percpu memory allocators, which
utilize vmalloc space for percpu memory. Without the patch, the
crash session would fail during initialization with the error message
"crash: cannot determine idle task addresses from init_tasks[] or
runqueues[]", followed by "crash: cannot resolve init_task_union".
(anderson@redhat.com)
- Fix for the snap.c extension module to properly handle NUMA systems
with multiple nodes, or single node systems whose first unity-mapped
PT_LOAD segment starts on a non-zero physical address. Without the
patch, a crash session on the resultant vmcore would fail with the
error message: "crash: vmlinux and <filename> do not match!"
(anderson@redhat.com)
- Added a defensive mechanism to handle corrupt Elf32_Phdr/Elf64_Phdr
structures in an ELF vmcore. Without the patch, a hand-carved bogus
p_offset field in an Elf32_Phdr/Elf64_Phdr structure could possibly
cause a segmentation violation during initialization. With the fix,
if an invalid Elf32_Phdr or Elf64_Phdr p_offset field is encountered,
a warning message will be displayed, and the crash session will bail
out gracefully, or continue on if possible.
(anderson@redhat.com)
- Added a defensive mechanism to handle corrupt Elf32_Ehdr/Elf64_Ehdr
structures in an ELF vmcore. Without the patch, a hand-carved bogus
e_phnum field in an Elf32_Ehdr/Elf64_Ehdr structure could possibly
cause a segmentation violation during initialization. With the fix,
if an invalid Elf32_Ehdr or Elf64_Ehdr e_phnum field is encountered,
a warning message will be displayed and the crash session will bail
out gracefully.
(anderson@redhat.com)
- More non-functional changes for future integration of gdb-7.0 and
for addressing Fedora packaging guidelines.
(anderson@redhat.com)
- Fix for the x86 "bt [-t|-T]" commands when the backtrace passes
through three stacks, which can happen when an interrupt is taken
while operating on a per-cpu soft IRQ stack, and the crash occurs
while operating on the per-cpu hard IRQ stack. Without the patch,
the "bt" command terminates after displaying the backtrace on the hard
IRQ stack; "bt -t" displays the stack contents of the hard IRQ stack
but stops with the error message "bt: non-process stack address for
this task: <task-address>"; "bt -T" displays the same error
message as "bt -t", but displays the stack contents of the process
stack. With the fix, all three "bt" invocations will display the
backtraces or kernel text addresses on all three stacks, correctly
transitioning from the hard IRQ stack to the soft IRQ stack to the
process stack.
(anderson@redhat.com)
- When handcrafting the backtrace starting point for the "bt" command
by using the -S option, and the starting stack address is not in
the task's process stack, a message gets displayed that indicates
"non-process stack address for this task". However, if the starting
stack address is a legitimate non-process stack address, such as a
hard or soft IRQ stack address, or an x86_64 exception stack address,
the message is confusing, and has been removed.
(anderson@redhat.com)
- 4.1.1 to 4.1.2 incremental patch
(12/09/09)
4.1.1 - Fix for a potential session initialization failure when running
against 2.6.30 or later x86_64 kernel dumpfiles whose pages have been
filtered by the makedumpfile facility. Without the patch, the
session may fail with the error message "crash: page excluded: kernel
virtual address: <address> type: cpu number (per_cpu)", but will
initialize OK if the "--zero_excluded" command line option is used.
(anderson@redhat.com)
- Added "lsmod" as a built-in alias for the "mod" command.
(anderson@redhat.com)
- Added a defensive mechanism to handle corrupt Elf32_Nhdr/Elf64_Nhdr
structures in an ELF vmcore. The fix no longer presumes that all
Elf32_Nhdr/Elf64_Nhdr structure contents are legitimate, and if an
invalid Elf32_Nhdr or Elf64_Nhdr structure is encountered, it will
be ignored and a warning message will be displayed showing the
structure contents, and the crash session will continue on. Without
the patch, it was possible that an invalid n_namesz or n_descsz
value could cause a segmentation violation when attempting to read
the bogus note contents.
(anderson@redhat.com)
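A bounds check of the kind described above might look like the sketch
below. It is an illustration only: it assumes a little-endian note
header of three 32-bit words (n_namesz, n_descsz, n_type) with the
name and descriptor each padded to 4 bytes, and the helper names are
hypothetical rather than taken from the crash sources.

```python
import struct

# Assumed layout: n_namesz, n_descsz, n_type as little-endian 32-bit words.
NHDR = struct.Struct("<III")


def align4(n):
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def note_is_sane(buf, offset):
    """Return True if the note starting at offset fits entirely within buf."""
    if offset < 0 or offset + NHDR.size > len(buf):
        return False
    n_namesz, n_descsz, _n_type = NHDR.unpack_from(buf, offset)
    end = offset + NHDR.size + align4(n_namesz) + align4(n_descsz)
    return end <= len(buf)


# A well-formed note, and one with a bogus n_namesz value.
good_note = NHDR.pack(5, 4, 1) + b"CORE\0\0\0\0" + b"\x01\x02\x03\x04"
bad_note = NHDR.pack(0xFFFFFFF0, 4, 1) + b"\0" * 8
```

A caller could skip any note for which the check fails and print a
warning, instead of reading past the end of the buffer.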
- Fix for "mach -c" command option on 2.6.30 and later x86_64 kernels
in which the per-cpu array x8664_pda data structures were replaced
with per-cpu variables. Without the patch, the command displays
just the boot cpu's cpuinfo data structure and then fails with the
error message: "mach: invalid structure name: x8664_pda".
(anderson@redhat.com)
- Fix to properly set the DEBUG exception stack size and stack base
address on 2.6.18 and later x86_64 kernels. Without the patch, the
DEBUG exception stack was presumed to be the same size as all of the
other exception stacks, so in the extremely rare occurrence that a
kernel crash started while running on a per-cpu DEBUG stack, the
backtrace code would not recognize it as such, and would either start
the trace using stale starting stack hooks, typically from "schedule"
while running on the process stack, or the backtrace attempt would
fail with the error message "bt: cannot transition from exception
stack to current process stack".
(anderson@redhat.com)
- Related to the above, when the x86_64 "bt" is displaying a trace
segment from one of the five exception stacks, change the output from
showing just "--- <exception stack> ..." to showing which exception
stack it's working from, for example, "--- <NMI exception stack> ---"
or "--- <DEBUG exception stack> ---", etc.
(anderson@redhat.com)
- Fix for a session initialization failure when running against 2.6.30
or later x86_64 kernels if the number of possible cpus equals the
kernel's configured NR_CPUS. Without the patch, the session fails
with the error message "crash: invalid kernel virtual address: cc08
type: cpu number (per_cpu)".
(bob.montgomery@hp.com)
- Preparations in the top-level source code for the integration of
gdb-7.0. The current embedded version remains gdb-6.1.
(anderson@redhat.com)
- 4.1.0 to 4.1.1 incremental patch
(11/20/09)
4.1.0 - Fix for s390x and x86 "extend" command regression created by the
"crash -x" option introduced in crash version 4.0.9. Without the
patch, the "extend" command on s390x and x86 machines fails with the
error message: "extend: <module>.so: not an ELF format object file".
(holzheu@linux.vnet.ibm.com, anderson@redhat.com)
- Cleanup of top-level source files to address compiler warnings
generated by the CFLAGS used in the Fedora build environment:
main.c ppc64.c tools.c symbols.c defs.h qemu-load.c qemu.c
xen_hyper_command.c xendump.c netdump.c s390_dump.c lkcd_common.c
remote.c cmdline.c x86_64.c net.c dev.c kernel.c task.c filesys.c
memory.c lkcd_x86_trace.c ppc64.c x86.c s390.c s390x.c s390dbf.c
Only two bugs (s390/s390x) were discovered as a result of this
exercise. The vast majority of the warnings were primarily benign
"may be used uninitialized in this function" false-positive warnings,
but were addressed nonetheless. A few "dereferencing type-punned
pointer will break strict-aliasing rules" warnings still exist, but a
fix attempt may prove more troublesome or dangerous than it's worth.
(anderson@redhat.com)
- Fix for "pte" command on s390 and s390x machines if the pte value
argument evaluates as not present. Without the patch, the command
would not display the pte value, but would either print random stack
data (if ASCII), or worst case, cause a segmentation violation.
(anderson@redhat.com)
- Allow command redirection to pipes or files when using gdb commands
alone on the command line without preceding the command string with
"gdb". Without the patch, the pipe/redirection data on the command
line would be appended to the command string passed to gdb, leading
to bizarre results when gdb attempts to evaluate the redirection
pieces of the command string.
(bob.montgomery@hp.com)
- Fix for the processing of bit fields on big endian systems in the
SIAL extension module. Without the patch, bits are not copied to
the correct position and are not shifted the right way.
(holzheu@linux.vnet.ibm.com)
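To illustrate the bit-position issue described above, here is a minimal Python sketch. It is not the SIAL module's code. It assumes the common ABI convention that bit fields are numbered from the least significant bit on little endian systems and from the most significant bit on big endian systems. The function name and parameters are hypothetical.

```python
def extract_bitfield(raw: bytes, bit_offset: int, width: int, big_endian: bool) -> int:
    """Return the value of a bit field held in the storage unit `raw`.

    Assumption: bit_offset counts from the least significant bit on
    little endian systems and from the most significant bit on big
    endian systems.
    """
    total_bits = len(raw) * 8
    if bit_offset < 0 or width <= 0 or bit_offset + width > total_bits:
        raise ValueError("bit field does not fit in the storage unit")

    # Interpret the bytes in the target's byte order.
    unit = int.from_bytes(raw, "big" if big_endian else "little")

    # The shift direction is what differs between the two layouts.
    if big_endian:
        shift = total_bits - bit_offset - width
    else:
        shift = bit_offset
    return (unit >> shift) & ((1 << width) - 1)
```

Using the little endian shift on big endian data reads the wrong bits, which is the kind of misplacement the entry above describes.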
- Fix for "dis -l" to properly display line-number information for
2.6.21 and later x86_64 kernel module text addresses. Without the
patch, a single erroneous file/line-number indication would be
displayed prior to the disassembly output, typically from the file
"include/linux/cpumask.h". This was due to an abnormal text block
descriptor from a function in hpet.c, which starts in the kernel
text segment and extends up into the vsyscall FIXMAP region,
effectively encompassing all kernel module address space.
(john.wright@hp.com)
- Related to the line number patch above, fix to prevent querying the
embedded gdb module for line numbers of kernel module text addresses
if the module's debuginfo data has not been loaded. Without the
patch, the same erroneous file/line-number could be displayed by
commands like "dis -l" or "bt -l" when a module's debuginfo data
has not been loaded, on 2.6.21 and later x86_64 kernels.
(anderson@redhat.com)
- Implemented a new "ps -G" option, which restricts the process status
output to show only the data of the thread group leader of a thread
group. The original request was to avoid the display of redundant
RSS data shared by many threads.
(anderson@redhat.com)
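The filtering idea can be sketched in Python as follows. This is an illustration only, not the crash implementation. It relies on the Linux convention that a thread group leader is the task whose pid equals its tgid. The Task record and its fields are hypothetical.

```python
from typing import List, NamedTuple


class Task(NamedTuple):
    # Hypothetical, simplified view of a task for illustration.
    pid: int
    tgid: int
    comm: str
    rss_pages: int


def thread_group_leaders(tasks: List[Task]) -> List[Task]:
    """Keep only tasks that lead their thread group (pid == tgid)."""
    return [t for t in tasks if t.pid == t.tgid]


tasks = [
    Task(pid=100, tgid=100, comm="java", rss_pages=5000),
    Task(pid=101, tgid=100, comm="java", rss_pages=5000),
    Task(pid=102, tgid=100, comm="java", rss_pages=5000),
    Task(pid=200, tgid=200, comm="bash", rss_pages=300),
]
leaders = thread_group_leaders(tasks)
```

Here the three "java" threads share one address space, so showing only pid 100 avoids repeating the same RSS figure three times.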
- Several fixes for the "repeat" command when used in conjunction
with an input file. Without the patch:
(1) Depending upon the command executed from the input file, a
SIGINT would kill the command currently being executed from
the input file, but the "repeat" command would then restart it.
(2) If a command in the input file redirected its output to a pipe,
the repeat operation could stop prematurely after executing
that particular command.
(3) If a command in the input file redirected its output to a pipe,
the zombies of the command being piped to would not be cleaned
up until the repeat command was stopped.
(4) If the last command in the input file redirected its output to a
pipe, all subsequent executions of the input file would only
display the output of that last command.
(anderson@redhat.com)
- Added "trace.c" to the extensions subdirectory, where it will get
built automatically when "make extensions" is run from the top-level
source directory. The trace.so extension module has also been added
to the crash-extensions-<version>.rpm subpackage that is created
by the crash-<version>.src.rpm, which installs extension modules
in the /usr/lib[64]/crash/extensions directory.
(anderson@redhat.com)
- Fix for a potential failure to initialize the kmem slab cache
subsystem on 2.6.22 and later CONFIG_SLAB kernels if the dumpfile
has pages excluded by the makedumpfile facility. Without the patch,
the following error message would be displayed during initialization:
"crash: page excluded: kernel virtual address: <address> type:
kmem_cache_s buffer", followed by "crash: unable to initialize kmem
slab cache subsystem".
(anderson@redhat.com)
- Fix for a potential session initialization failure on x86_64 kernels
if the dumpfile has pages excluded by the makedumpfile facility.
Without the patch, the following error message would be displayed:
"crash: page excluded: kernel virtual address: <address> type:
tss_struct ist array".
(anderson@redhat.com)
- Fix for "kmem -z" option on 2.6.29 and later kernels. Without the
patch, against 2.6.29 and 2.6.30 kernels, the embedded zone VM_STAT
contents would not be displayed after the top line showing the SIZE,
PRESENT, /MIN/LOW/HIGH and FREE page counts; on 2.6.31 kernels, the
command would fail with the error message: "kmem: invalid (optional)
structure member offsets: zone_pages_min or zone_struct_pages_min".
(anderson@redhat.com)
- Fix for "irq" command on 2.6.29 and later CONFIG_SPARSE_IRQ kernels.
Without the patch, the "irq [number]" command would fail on x86_64
with the error message: "irq: x86_64_dump_irq: irq_desc[] does not
exist?", on ia64: "ia64_dump_irq: neither irq_desc or _irq_desc
exist", and on the other architectures: "irq: neither irq_desc nor
_irq_desc symbols exist".
(anderson@redhat.com)
- Fix for the "kmem -i" option on 2.6.31 kernels. Without the patch
the SHARED column may erroneously indicate 0 pages.
(anderson@redhat.com)
- Fix for the "kmem -i" option on 2.6.26 through 2.6.30 x86_64 kernels.
Without the patch, the swap page information would not be displayed,
and the error message "kmem: swap_info[0].swap_map at <address> is
unaccessible" would be displayed.
(anderson@redhat.com)
- Fix for "kmem -p" option on older 64-bit kernels that have a 32-bit
page.flags field. Without the patch, the page.count field in the
page structure would get merged with the page.flags field, and the
result displayed as a 64-bit value in the FLAGS column.
(anderson@redhat.com)
- Fix for "kmem -i" option on older kernels whose unreferenced
page.count value was -1 (instead of 0). Without the patch,
the SHARED column would contain invalid values.
(anderson@redhat.com)
- Change the cursor location when cycling through the command history
when in "vi" editing mode (the default). When using the arrow keys,
or when using CTRL-n and CTRL-p, the cursor will be placed after the
last character in each line, and will be in "insert" mode. When
using ESC followed by j or k, the cursor will be placed on the last
character in the line, and will be in "command" mode. Without the
patch, the cursor would be placed on the first character in the line
regardless of the keys used to cycle through the history.
(anderson@redhat.com)
- 4.0.9 to 4.1.0 incremental patch
(10/07/09)
4.0.9 - Versioning has been changed such that the crash-<version>.tar.gz
file no longer contains a "-" in the <version> number, and the
crash-<version>-0.src.rpm will always have a crash.spec release
number of "0". When the crash binary is built from the src.rpm file,
the "-0" will not be included/displayed as part of the crash binary's
version number, so that it will match the crash binary version that
is built from the crash-<version>.tar.gz file. This is being done
so that distributions can take the crash-<version>.tar.gz file
and append their own crash.spec file release numbering scheme onto
the base <version> number when creating their own src.rpm package.
(anderson@redhat.com)
- Also available in Fedora Rawhide devel branch:
build: dist-f12 crash-4.0.9-2.fc12
http://koji.fedoraproject.org/koji/buildinfo?buildID=131574
- Wholesale replacement of the x86/x86_64 disassembly code in the
embedded gdb-6.1 module to that used in gdb-6.8. The primary motive
is for CONFIG_FUNCTION_TRACER kernels, which contain 5-byte nopl
instructions that can be overwritten during runtime for dynamic
ftracing. That particular nop format was not recognized by the older
disassembly code in gdb-6.1, and printed a "(bad)" instruction
followed by an incorrect "add" instruction. For example, without the
patch, the instructions at sys_write+11 and sys_write+13 below are
not correct:
crash> dis sys_write
0xffffffff8113c56b <sys_write>: push %rbp
0xffffffff8113c56c <sys_write+1>: mov %rsp,%rbp
0xffffffff8113c56f <sys_write+4>: push %r12
0xffffffff8113c571 <sys_write+6>: push %rbx
0xffffffff8113c572 <sys_write+7>: sub $0x30,%rsp
0xffffffff8113c576 <sys_write+11>: (bad)
0xffffffff8113c578 <sys_write+13>: add %r8b,(%rax)
0xffffffff8113c57b <sys_write+16>: mov %rsi,%r12
...
With the patch, the 5-byte instruction is properly translated:
crash> dis sys_write
0xffffffff8113c56b <sys_write>: push %rbp
0xffffffff8113c56c <sys_write+1>: mov %rsp,%rbp
0xffffffff8113c56f <sys_write+4>: push %r12
0xffffffff8113c571 <sys_write+6>: push %rbx
0xffffffff8113c572 <sys_write+7>: sub $0x30,%rsp
0xffffffff8113c576 <sys_write+11>: nopl 0x0(%rax,%rax,1)
0xffffffff8113c57b <sys_write+16>: mov %rsi,%r12
...
There are other side-effects/changes such as the output of negative
relative offsets from registers. For example, instructions that were
displayed like this without the patch:
mov 0xffffffffffffffc8(%rbp),%rdx
are now displayed in an easier-to-understand format:
mov -0x38(%rbp),%rdx
There are undoubtedly other subtle changes as well.
(anderson@redhat.com)
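The relationship between the two displacement forms shown above is plain two's-complement arithmetic. The following Python sketch (a hypothetical helper, not part of gdb or crash) converts an unsigned 64-bit displacement into the signed form:

```python
def signed_displacement(value: int, bits: int = 64) -> str:
    """Format an unsigned two's-complement displacement as signed hex."""
    value &= (1 << bits) - 1
    if value >= 1 << (bits - 1):
        # The top bit is set, so the value represents a negative number.
        return "-0x%x" % ((1 << bits) - value)
    return "0x%x" % value


old_form = 0xffffffffffffffc8
new_form = signed_displacement(old_form)
```

With the value from the example above, new_form is "-0x38".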
- Fix for compressed diskdump/kdump vmcores to properly handle
page descriptor structures that are located beyond a 4GB file
offset in the vmcore file.
(oomichi@mxs.nes.nec.co.jp)
- Fix for x86_64 "bt" command to properly recognize vsyscall FIXMAP
virtual addresses when encountered as the RIP in an exception frame.
Without the patch, the exception frame would be followed by the
warning message: "bt: WARNING: possibly bogus exception frame".
(anderson@redhat.com)
- Fix for the "sym <address>" command option when the address
references a symbol in the vsyscall FIXMAP virtual address page
in certain x86_64 kernel versions. Without the patch, the command
would fail with a "symbol not found" message. This would also affect
commands that perform symbolic translations of virtual addresses,
such as "rd -s".
(anderson@redhat.com)
- Fix for the x86_64 "bt" command that may possibly start the backtrace
of an active non-crashing task on its per-cpu IRQ stack instead of
starting from the NMI exception stack. This could only occur on
a kdump-generated vmcore, and as a result, the backtrace would make
a faulty transition back to the process stack, dump a bogus exception
frame, and display: "bt: WARNING: possibly bogus exception frame".
(anderson@redhat.com)
- Fix for the x86_64 "bt" command in determining the frame just above
an IRQ interrupt exception frame, or above an exception frame that
gets handled on the process stack, such as a page fault. Without
the patch, the frame size of the interrupted function was being
incorrectly calculated, and could result in the display of an invalid
stale frame just above the exception frame register dump.
(anderson@redhat.com)
- Fix for the x86_64 "bt" command's frame size calculating mechanism
to differentiate between text return addresses and the precise text
RIP address of an exception. Without the patch, the instruction of
the text return address location was being incorrectly scanned for
instructions that modify the frame size, and could result in the
skipping of a stack frame.
(anderson@redhat.com)
- Fix for usage of a System.map file argument with 2.6.30 and later
kernels (which only should be done if the vmlinux file does not match
the vmcore or live system being analyzed). Without the patch there
may be several hundred "crash: symbol count overflow (trace_kmalloc)"
messages displayed during the back-patching of the gdb minimal_symbol
table phase.
(anderson@redhat.com)
- Fix for usage of a System.map file argument whose symbol list does
not contain an "_end" symbol. Without the patch, the crash session
fails during initialization with the error message: "crash: cannot
resolve _end".
(anderson@redhat.com)
- Fix for "kmem -p <address>" or "kmem <address>" options when the
<address> is not a page structure address. Without the patch,
starting with crash version 4.0-8.11, harmless but annoying "kmem:
WARNING: sparsemem: invalid section number: 8192" messages would be
displayed.
(anderson@redhat.com)
- Fix for the snap.so extension module when run on pre-2.6.31 x86_64
kernels with more than 4GB of physical memory. Without the patch,
the resultant vmcore would not include memory above 4GB because
the /proc/iomem file did not display it. A typical crash session
would fail during initialization with an error message such as
"crash: read error: kernel virtual address: 1020009d024 type:
tss_struct ist array".
(anderson@redhat.com)
- Fix for the build of the sial.so extension module if /usr/bin/bison
and /usr/bin/flex do not exist on the host build system. When those
files do not exist, the build of sial.so generates a huge number of
error messages, ending with "make[3]: [sial.so] Error 1 (ignored)".
Since it is preferable to avoid extra BuildRequires entries in the
crash.spec file for extension modules, and given that it is often
built from a tar.gz installation, the failed build will indicate:
"sial.so: build failed: requires /usr/bin/flex and /usr/bin/bison".
(anderson@redhat.com)
- Fix for the build of the snap.so extension module on older systems
running with "make" versions 3.80 or earlier. Without the patch,
the build of snap.so would fail like so:
snap.mk:4: Extraneous text after `else' directive
snap.mk:7: Extraneous text after `else' directive
snap.mk:7: *** only one `else' per conditional. Stop.
make[2]: [snap.so] Error 2 (ignored)
The snap.mk file has been modified to conform to the older format.
(anderson@redhat.com)
- Fix for the "rd" and "vtop" commands on RHEL4 x86_64 Xen paravirtual
kernels in the reading or translation of vmalloc addresses that are
not in kernel module vmalloc address space. In that kernel version
(and none other that I am aware of), the PAGE_OFFSET unity-map kernel
virtual address of 0xffffff8000000000 is larger than the address of
its VMALLOC_START, 0xffffff0000000000. Because of that, without the
patch, "rd" would fail with the error message "rd: invalid user
virtual address: <address> type: 64-bit UVADDR", "vtop" would
fail with the error message "vtop: ambiguous address: <address>
(requires -u or -k)", and "vtop -k" would incorrectly report that
the <address> was "(not a kernel virtual address)".
(anderson@redhat.com)
- Implemented a new "-x" command line option that will automatically
load extension modules from a particular directory. The search for
the extension module directory will be done in the following order,
and the first one (if any) that exists will be selected as the
target directory:
1. the directory specified in the CRASH_EXTENSIONS shell
environment variable
2. /usr/lib64/crash/extensions (64-bit architectures)
3. /usr/lib/crash/extensions
4. ./extensions
All extension modules that are found in the target directory will
be loaded automatically.
(anderson@redhat.com)
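The directory selection order listed above can be sketched in Python as follows. This is an illustration of the documented order only, not the crash source code. The function name and its parameters are hypothetical, and the isdir parameter exists only so the logic can be exercised without touching the filesystem.

```python
import os


def find_extension_dir(env=None, is_64bit=True, isdir=os.path.isdir):
    """Return the first existing extension directory, or None."""
    env = os.environ if env is None else env

    candidates = []
    if env.get("CRASH_EXTENSIONS"):
        candidates.append(env["CRASH_EXTENSIONS"])       # 1. environment variable
    if is_64bit:
        candidates.append("/usr/lib64/crash/extensions")  # 2. 64-bit only
    candidates.append("/usr/lib/crash/extensions")        # 3.
    candidates.append("./extensions")                     # 4.

    for path in candidates:
        if isdir(path):
            return path
    return None
```

For example, find_extension_dir(env={}, isdir=lambda p: p == "./extensions") selects "./extensions" because none of the earlier candidates exist.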
- 4.0-8.12 to 4.0.9 incremental patch
(9/10/09)
4.0-8.12 - Fix to support 2.6.30 and later x86 CONFIG_4KSTACKS kernels, where
the hardirq_ctx[] and softirq_ctx[] NR_CPUS-bounded arrays were
replaced with per-cpu variables. Without the patch, the crash
session would fail during initialization with the error message
"crash: cannot resolve: hardirq_ctx".
(oomichi@mxs.nes.nec.co.jp)
- Clean up gdb header files that generate warning messages when
compiling the top-level cmdline.c file with "make warn" or
"make Warn". (anderson@redhat.com)
- If an attempt is made to use an x86 vmlinux file on an x86_64 host,
bail out with a "not a supported file format" error immediately
instead of later on when trying to match the linux_banner string.
(anderson@redhat.com)
- Fix for "bt" command on x86 Xen hypervisor dumpfiles where a vcpu
received a shutdown NMI while running in an interrupt handler.
Without the patch, the backtrace would indicate "bt: cannot resolve
stack trace", and dump the text symbols on the stack.
(anderson@redhat.com)
- Implemented support for the KVM "save-vm" file format, which is
also proposed as the dumpfile format for the "virsh dump" command
for KVM guests.
(pbonzini@redhat.com, anderson@redhat.com)
- Increase NR_CPUS from 512 to 4096 for x86_64.
(caiqian@redhat.com)
- Correct cpu accounting when processors have been taken offline using
a new get_highest_cpu_online() utility function. Without the patch,
commands that have per-cpu displays may not show a cpu's information
and/or may show information for an offline cpu. This patch only
addresses 2.6.30 and later x86_64 kernels in which the per-cpu array
x8664_pda data structures were replaced with per-cpu variables.
(anderson@redhat.com)
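The accounting idea can be illustrated with a short Python sketch. It is not the C implementation of get_highest_cpu_online(). It assumes the online cpus are available as an integer bitmask in which bit N is set when cpu N is online, and the function names are hypothetical.

```python
def highest_cpu_online(online_mask: int) -> int:
    """Return the highest cpu number set in the mask, or -1 if none."""
    return online_mask.bit_length() - 1


def online_cpus(online_mask: int) -> list:
    """Return the cpu numbers that a per-cpu display should cover."""
    return [cpu for cpu in range(highest_cpu_online(online_mask) + 1)
            if (online_mask >> cpu) & 1]


# cpus 0, 1 and 3 online, cpu 2 taken offline
mask = 0b1011
```

Simply counting the online cpus (3 here) and displaying cpus 0 through 2 would skip cpu 3 and show the offline cpu 2. Walking the mask up to the highest online cpu avoids both problems.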
- Replace the CFLAGS definition in the Makefile with a CRASH_CFLAGS
definition, which in turn contains ${CFLAGS}. This will allow the
issuing of user-defined CFLAGS on the "make" command line, as is done
according to the Fedora build guidelines.
(lkundrak@v3.sk)
- Fix for a segmentation violation within the embedded gdb module
during session invocation, when running against kernels built with
Fedora gcc version 4.4.0-12 and later (2.6.31-0.62.rc2.git4.fc12
and later Fedora kernels). The gcc update introduced a more
compact Dwarf 3 DW_AT_data_member location, which in turn required
a patch to all versions of gdb.
(lkundrak@v3.sk)
- 4.0-8.11 to 4.0-8.12 incremental patch
(8/11/09)
4.0.8.11-2.fc12 - Fedora release only
- Fix for a segmentation violation within the embedded gdb module
during session invocation, when running against kernels built with
Fedora gcc version 4.4.0-12 and later (2.6.31-0.62.rc2.git4.fc12
and later Fedora kernels). The gcc update introduced a more
compact Dwarf 3 DW_AT_data_member location, which in turn required
a patch to all versions of gdb.
(lkundrak@v3.sk)
- Available in Fedora Rawhide devel branch:
build: dist-f12 crash-4.0.8.11-2.fc12
http://koji.fedoraproject.org/koji/buildinfo?buildID=126403
(8/09/09)
4.0-8.11 - Also available in Fedora Rawhide devel branch:
build: dist-f12 crash-4.0.8.11-1.fc12
http://koji.fedoraproject.org/koji/buildinfo?buildID=125683
- Kdump ELF vmcores contain NT_PRSTATUS notes for online cpus only, so
if cpus have been offlined prior to a crash, there will be fewer
notes than the number of cpus in the system, and therefore there will
not be a one-to-one correlation between each cpu and its associated
NT_PRSTATUS note. That causes backtrace failures for architectures
like ppc64 that depend upon the contents of the NT_PRSTATUS notes for
gathering the starting stack location.
(chandru@in.ibm.com, anderson@redhat.com)
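The correlation problem can be illustrated with a Python sketch. It is not the crash implementation. It assumes that the notes appear in ascending order of online cpu number and that the online cpus are known as an integer bitmask. The function name is hypothetical.

```python
def map_notes_to_cpus(online_mask: int, notes: list) -> dict:
    """Associate each NT_PRSTATUS note with the online cpu it belongs to.

    Assumption: notes are stored in ascending online-cpu order.
    """
    online = [cpu for cpu in range(online_mask.bit_length())
              if (online_mask >> cpu) & 1]
    if len(notes) != len(online):
        raise ValueError("note count does not match online cpu count")
    return dict(zip(online, notes))


# cpu 1 was offlined before the crash, so only three notes exist
mapping = map_notes_to_cpus(0b1101, ["note-a", "note-b", "note-c"])
```

Indexing the notes directly by cpu number would hand cpu 2 the note that belongs to cpu 3 and leave cpu 3 with none.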
- Fix and enhancement for the "dev" command. When the command was run
against 2.6.26 or later kernels, it would fail with the error message
"dev: invalid structure member offset: char_device_struct_fops".
Additionally, even when the command did work, more often than not it
would fail to determine the file_operations structure associated with
the block or character device, and erroneously display "(none)" or
"(unused)". This patch makes a more comprehensive search for the
file_operations structure, and instead of just displaying its address
and symbolic translation, it will display the address of the data
structure that contains the pointer to the file_operations structure,
along with the symbolic translation of the file_operations structure.
For character devices, the containing structure is a "cdev", and for
block devices the containing structure is a "gendisk". The command
output adds new CDEV and GENDISK columns, and under the OPERATIONS
column is the symbolic translation of its file_operations structure.
(anderson@redhat.com, bob.montgomery@hp.com)
- Fix for a potential segmentation violation when running "foreach bt"
on a very active live system with many processes starting and ending.
Without the patch, a segmentation violation could occur when a "bt"
was attempted on a task that had become non-existent. This would
happen on x86_64 or ppc64 machines, and was due to the usage of a
kernel stack pointer taken from a stale/invalid task_struct. The
command will now recognize the bad stack pointer and display the
error message "bt: task no longer exists" or "bt: invalid/stale
stack pointer for this task: <address>".
(anderson@redhat.com)
- Fix to correctly read LKCD Version 8 and later x86 dumpfile headers.
(talk90091e@gmail.com)
- If a kdump NMI issued to a non-crashing x86_64 cpu was received while
running in schedule(), after having set the next task as "current" in
the cpu's runqueue, but prior to changing the kernel stack to that of
the next task, then a backtrace would fail to make the transition
from the NMI exception stack back to the process stack, with the
error message "bt: cannot transition from exception stack to current
process stack". This patch will report inconsistencies found between
a task marked as the current task in a cpu's runqueue, and the task
found in the per-cpu x8664_pda "pcurrent" field (2.6.29 and earlier)
or the per-cpu "current_task" variable (2.6.30 and later). If it can
be safely determined that the runqueue setting (used by default) is
premature, then the crash utility's internal per-cpu active task will
be changed to be the task indicated by the appropriate architecture
specific value. Also, a new "set -a <task>" option has been added
to manually set a task to be the "active" task on its cpu.
(anderson@redhat.com)
- Fix for x86_64 "bt" command when transitioning from the IRQ stack
back to the process stack on 2.6.29 and later kernels. Without the
patch, the interrupt exception frame address on the process stack
would be incorrectly determined, and its display would typically be
preceded by "[exception RIP: unknown or invalid address]", and the
backtrace would fail from that point on.
(anderson@redhat.com)
- Enhancement to the "runq" command to show the current task in each
cpu's runqueue, plus a few formatting changes to make the output
easier to understand.
(anderson@redhat.com)
- Fix for a memory leak when running on live systems, due to the
repetitive reallocation of the internal array of active tasks.
(anderson@redhat.com)
- Fix for usage with vmlinux debuginfo files using Dwarf 3 format,
for example, the Fedora 2.6.31-0.24.rc0.git18.fc12 kernel. Without
the patch, the crash session fails during initialization with the
error message: "Dwarf Error: wrong version in compilation unit header
(is 3, should be 2) [in module <path-to>/vmlinux]", followed by
the erroneous message "crash: <path-to>/vmlinux: no debugging
data available". The patch simply accepts the Dwarf 3 header, and
the embedded gdb-6.1 version still appears to work with the updated
vmlinux debuginfo file format.
(anderson@redhat.com)
- Fix for faulty invocation failure when a System.map file is used as
an argument with a compressed diskdump or compressed kdump dumpfile.
If the System.map argument appears after the vmcore file on the
command line, as in: "crash vmcore System.map vmlinux", the crash
session fails immediately with the error message: "crash: vmcore:
initialization failed". With the patch, the arguments may be entered
in any order.
(anderson@redhat.com)
- Fix for a potential segmentation violation during invocation if a
vmcore file, a System.map file, and a non-matching vmlinux file are
used as command line arguments. The problem is that whenever a
System.map file is used, it is presumed that the user knows what he
is doing, and that the vmlinux file is not the same as the kernel
that generated the vmcore; therefore the vmlinux/vmcore matching and
verification routines are not performed. However, if the kernel data
structures in the non-matching vmlinux vary widely enough from the
kernel that generated the vmcore, all manner of bogus data may be
read and consumed. The reported segmentation violation occurred when
using a vmcore created from a "stock" Red Hat kernel with a vmlinux
file from a Red Hat "debug" kernel, where the kernel data structures
are significantly different. The patch adds several new defensive
mechanisms, and displays additional warning messages, when invalid or
questionable data is read, and as a result the crash session will fail
in a more reasonable manner.
(anderson@redhat.com)
- Adjusted several virtual and physical memory address definitions for
2.6.31 x86_64 kernels: MAX_PHYSMEM_BITS, VMALLOC_START, VMALLOC_END,
VMEMMAP_VADDR, VMEMMAP_END, MODULES_VADDR and MODULES_END. Without
the patch, when run against CONFIG_SPARSEMEM_VMEMMAP 2.6.31 kernels,
the "kmem -i" option would hang, and when run against CONFIG_SLUB and
CONFIG_SPARSEMEM_VMEMMAP 2.6.31 kernels, the "kmem -s" option would
report numerous errors indicating "kmem: read error: kernel virtual
address: <address> type: page inuse", where the <address> was
a legitimate virtual-memmap page structure address.
(anderson@redhat.com)
- Improvement for CONFIG_SLUB "kmem -s" or "kmem -S" options when an
invalid slab page link address is encountered. Without the patch,
the commands fail with a generic "invalid kernel virtual address"
read error message, and "kmem -s" would not display any previously
collected statistics. With the patch, the error message displays
the slab cache name, the list type, and the invalid pointer found,
for example, "kmem: dentry: partial list: page.lru.next: 100100".
(anderson@redhat.com)
- 4.0-8.10 to 4.0-8.11 incremental patch
(6/30/09)
4.0-8.10 - Enhancement for currently-existing "mod -S <directory>" option to
make it search for the module debuginfo tree in the same specified,
non-standard, directory tree. When "mod -S" is used without a
specified directory argument, the "<module>.ko" object files are
searched for in the standard "/lib/modules/<release>" directory
tree, and their associated "<module>.ko.debug" are searched for
in the standard "/usr/lib/debug/lib/modules/<release>" directory
tree. Without this patch, "mod -S <directory>" would search the
specified non-standard directory tree for the kernel's "<module>.ko"
files, but the associated "<module>.ko.debug" would not be found.
With the patch, the search for the associated "<module>.ko.debug"
files will be made in the following order and manner:
1. in the same directory containing the "<module>.ko" file.
2. in the ".debug" subdirectory of the directory containing the
"<module>.ko" file.
3. if the "<module>.ko" file was found in a directory pathname
containing the "/lib/modules" component, then the search will be
made in the associated "/usr/lib/debug/lib/modules" location.
This enhancement will allow an alternate module/module-debuginfo
directory tree to be set up like so:
# cd <directory>
# rpm2cpio kernel-<release>.rpm | cpio -idv
# rpm2cpio kernel-debuginfo-<release>.rpm | cpio -idv
Having done that, the currently-existing "mod -S <directory>"
option will find both the "<module>.ko" and "<module>.ko.debug"
files. In addition, a new "--mod" command-line option may be used
to specify the directory tree:
# crash vmlinux [vmcore] --mod <directory>
When the "--mod <directory>" command line option is used, then
"mod -S" (without a directory argument) will search that directory
tree by default instead of using the standard location.
(anderson@redhat.com)
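The three-step debuginfo search described above can be sketched in Python. This is an illustration, not the crash source code. In particular, the way step 3 rewrites the "/lib/modules" path component is an assumption based on the rpm2cpio layout shown above, and the function names are hypothetical.

```python
import os


def debuginfo_candidates(ko_path: str) -> list:
    """List the places to look for <module>.ko.debug, in search order."""
    directory, name = os.path.split(ko_path)
    debug_name = name + ".debug"

    candidates = [
        os.path.join(directory, debug_name),            # 1. same directory
        os.path.join(directory, ".debug", debug_name),  # 2. .debug subdirectory
    ]

    marker = "/lib/modules"
    if marker in directory:
        # 3. assumed mapping into the parallel debuginfo tree
        prefix, _, rest = directory.partition(marker)
        debug_dir = prefix + "/usr/lib/debug/lib/modules" + rest
        candidates.append(os.path.join(debug_dir, debug_name))
    return candidates


def find_debuginfo(ko_path: str, exists=os.path.exists):
    """Return the first candidate that exists, or None."""
    for path in debuginfo_candidates(ko_path):
        if exists(path):
            return path
    return None
```

For a module unpacked under /tmp/tree, the third candidate lands in /tmp/tree/usr/lib/debug/lib/modules/<release>/..., which is where the kernel-debuginfo package unpacked by rpm2cpio places its files.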
- Fix to handle the 2.6.29 replacement of the symbols "cpu_online_map",
"cpu_present_map" and "cpu_possible_map" with analogous symbols
"cpu_online_mask", "cpu_present_mask" and "cpu_possible_mask".
Without this patch, crash would fail during initialization on s390
and s390x systems with the error message "crash: cannot resolve
cpu_online_map", or with the error message "crash: PPC64: cannot find
cpu_present_map or cpu_online_map symbols" on ppc64 systems.
(holzheu@linux.vnet.ibm.com, anderson@redhat.com)
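The fallback between the old and new symbol names can be sketched in Python. This is an illustration only. The symbol table is modeled as a hypothetical dictionary of name to address, and the function name is not taken from the crash source.

```python
def resolve_cpu_map(symbols: dict, kind: str):
    """Return (name, address) for the cpu map symbol of the given kind.

    `kind` is "online", "present" or "possible". The 2.6.29 and later
    "_mask" name is tried first, then the older "_map" name.
    """
    for name in ("cpu_%s_mask" % kind, "cpu_%s_map" % kind):
        if name in symbols:
            return name, symbols[name]
    raise LookupError("cannot resolve cpu_%s_mask or cpu_%s_map" % (kind, kind))


old_kernel = {"cpu_online_map": 0xc0400000}
new_kernel = {"cpu_online_mask": 0xc0500000}
```

A caller that only looks up "cpu_online_map" fails on new_kernel, which corresponds to the initialization failure described above.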
- Added several function prototypes for the SIAL extension module file
sial.c because of its inability to #include "defs.h". Without
the patch, the compiler would presume that several un-prototyped
functions would have a return value of int, and therefore would
truncate 64-bit return values into 32-bits.
(holzheu@linux.vnet.ibm.com)
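The effect of the implicit int return type can be modeled in Python. This sketch only mimics the arithmetic and assumes a platform where int is 32 bits and pointers are 64 bits. The function names and the sample address are hypothetical.

```python
def truncate_to_int32(value: int) -> int:
    """Model a 64-bit value being returned through a 32-bit signed int."""
    low = value & 0xFFFFFFFF
    return low - (1 << 32) if low & 0x80000000 else low


def widen_to_uint64(value: int) -> int:
    """Model the truncated int being sign-extended back to 64 bits."""
    return value & 0xFFFFFFFFFFFFFFFF


address = 0xffff810012345678
seen_by_caller = widen_to_uint64(truncate_to_int32(address))
```

The upper 32 bits of the address are lost, so the caller ends up with 0x12345678 instead of the original kernel virtual address.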
- If by remote chance the panic task cannot be determined from a ppc64
kdump vmcore, a segmentation violation would occur during crash
session initialization.
(anderson@redhat.com)
- An additional directory has been added to the currently-existing
list of directories that the "extend" command searches when the
extension module file is not expressed with a fully-qualified
pathname. The following directories will be searched in the order
shown, and the first instance of the file that is found will be
selected:
1. the current working directory
2. the directory specified in the CRASH_EXTENSIONS shell
environment variable
3. /usr/lib64/crash/extensions (64-bit architectures)
4. /usr/lib/crash/extensions
5. ./extensions
(anderson@redhat.com)
- Fix the "extensions/Makefile" to force a rebuild of extension modules
when the "defs.h" file is newer than the module source.
(anderson@redhat.com)
- Added "snap.c" and "snap.mk" files to the extensions directory. The
new module contains a "snap" command that creates a kdump or netdump
dumpfile from a live system. Currently the x86, x86_64, ppc64 and
ia64 architectures are supported. The snap.so extension module has
been added to the crash-extensions-<release>.rpm, which is created
by the crash-<release>.src.rpm, which installs extension modules
in the /usr/lib[64]/crash/extensions directory.
(anderson@redhat.com)
- Added a set of functions that, for an active task, return a pointer
to the associated register set found in an NT_PRSTATUS note in
netdump and kdump ELF dumpfiles if one exists. They are not used by
the crash source code, but are available to extension modules.
(sharyath@in.ibm.com)
- Use the "crashing_cpu" kernel symbol as a more efficient manner of
determining a kdump x86_64 panic task.
(anderson@redhat.com)
- Fix to handle the replacement of the per-cpu array of x8664_pda data
structures with per-cpu variables in 2.6.30. Without the patch, an
x86_64 crash session would die during initialization with the error
message: "crash: invalid structure size: x8664_pda".
(anderson@redhat.com, nishimura@mxp.nes.nec.co.jp)
- 4.0-8.9 to 4.0-8.10 incremental patch
(5/29/09)
4.0-8.9 - Tentatively scheduled as the baseline version for the RHEL5.4 crash
utility errata release.
- Implemented a new "bt -g" option, which will display the backtraces
of all threads in the targeted task's thread group. The thread
group leader's backtrace will be displayed first, regardless of
which task was the target of the "bt" command.
(anderson@redhat.com)
- Implement support for the kdump "split-dumpfile" format, which can
split /proc/vmcore into multiple dumpfiles as specified by the
"makedumpfile --split" command option. It simply requires that all
of the split dumpfile names be entered on the crash command line.
(tindoh@redhat.com)
- Fix for "kmem -i", "kmem -n" and "kmem -p" on x86_64 CONFIG_SPARSEMEM
and CONFIG_SPARSEMEM_EXTREME kernels that have MAX_PHYSMEM_BITS
increased from 40 to 44. Without the patch, erroneous page-related
data could be displayed depending upon the amount of physical memory
contained by the target system.
(anderson@redhat.com)
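Why the MAX_PHYSMEM_BITS value matters for sparsemem bookkeeping can be shown with a little arithmetic. The Python sketch below is not crash code. It assumes SECTION_SIZE_BITS is 27 (128MB sections) on x86_64, and the function names are hypothetical.

```python
SECTION_SIZE_BITS = 27  # assumed x86_64 value: 128MB per memory section


def max_sections(max_physmem_bits: int, section_size_bits: int = SECTION_SIZE_BITS) -> int:
    """Number of memory sections addressable with the given limit."""
    return 1 << (max_physmem_bits - section_size_bits)


def section_nr(phys_addr: int, section_size_bits: int = SECTION_SIZE_BITS) -> int:
    """Memory section number that contains a physical address."""
    return phys_addr >> section_size_bits


# A page located at the 2TB physical address mark
nr = section_nr(1 << 41)
valid_with_40_bits = nr < max_sections(40)
valid_with_44_bits = nr < max_sections(44)
```

Under these assumptions a 40-bit limit covers 8192 sections and a 44-bit limit covers 131072, so a tool still using the 40-bit value would treat section 16384 as out of range.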
- For the architectures that support it, the "--machdep option=value"
command line option has been modified to allow more than one machine-
dependent argument. (anderson@redhat.com)
- The starting backtrace locations of active, non-crashing, xen dom0
tasks are not available in kdump dumpfiles, nor is there anything
that can be searched for in their respective stacks. Therefore, for
those tasks, the "bt" command will indicate: "bt: starting
backtrace locations of the active (non-crashing) xen tasks cannot be
determined: try -t or -T options". Without the patch, the backtrace
would either be empty, or it would show an invalid backtrace starting
at the last location where schedule() had been called.
(anderson@redhat.com)
- Fix for potentially empty "bt -t" output, and for "bt -T" potentially
dumping the text return addresses in the hard or soft IRQ stacks
instead of the process stack. This could occur if the targeted task
was the last task that used the hard or soft IRQ stack (x86 only).
(anderson@redhat.com)
- 4.0-8.8 to 4.0-8.9 incremental patch
(4/16/09)
4.0-8.8 - If a live kernel crash session fails during initialization due to
read errors, and it appears to be because the running kernel was
configured with CONFIG_STRICT_DEVMEM, display this warning message:
"crash: This kernel may be configured with CONFIG_STRICT_DEVMEM,
which renders /dev/mem unusable as a live memory source."
(anderson@redhat.com)
- Fix for the "bt" command to prevent a segmentation violation seen
with an x86_64 Egenera/LKCD dumpfile where the starting stack hooks
for the active tasks in the dumpfile header were nonsensical.
(anderson@redhat.com)
- Fix for the chronological display of the kernel printk buffer data
by the "log" command if the administrator has cleared the buffer
with syslog() or klogctl(). (oomichi@mxs.nes.nec.co.jp)
- Change the message displayed when supplying a non-process stack
address as an argument to "bt -S". Because the supplied address
is typically valid, such as a hard or soft IRQ stack address,
the message will indicate "non-process address" instead of
"invalid stack address". (anderson@redhat.com)
- The crash-<release>.src.rpm will create an additional binary
crash-extensions-<release>.rpm file containing the sial.so and
dminfo.so extension modules. The modules will be installed in the
/usr/lib[64]/crash/extensions directory.
(holzheu@linux.vnet.ibm.com, anderson@redhat.com)
- If a shared-object filename passed to the "extend" command is not
expressed with a fully-qualified pathname, the following directories
will be searched in the order shown, and the first instance of the
file that is found will be selected:
1. the current working directory
2. the directory specified in the CRASH_EXTENSIONS shell
environment variable
3. /usr/lib64/crash/extensions (64-bit architectures)
4. /usr/lib/crash/extensions
The same rules will be applied when unloading shared object files
with "extend -u <shared-object>". Without the patch, only files
in the current directory or those specified with a fully-qualified
pathname were accepted. (anderson@redhat.com)
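The search order above can be sketched as follows. This is a minimal illustration rather than the crash implementation; the function name and its parameters are hypothetical, and the file-existence check is injectable so the logic can be exercised without touching the filesystem.

```python
import os


def find_extension(name, cwd, env, is_64bit, is_file=os.path.isfile):
    """Return the first matching path for a shared-object name, or None.

    Hypothetical helper: mirrors the documented search order only.
    """
    # A fully-qualified pathname is used as given, with no directory search.
    if os.path.isabs(name):
        return name if is_file(name) else None

    search_dirs = [cwd]                       # 1. current working directory
    ext_dir = env.get("CRASH_EXTENSIONS")
    if ext_dir:
        search_dirs.append(ext_dir)           # 2. CRASH_EXTENSIONS directory
    if is_64bit:
        search_dirs.append("/usr/lib64/crash/extensions")  # 3. 64-bit only
    search_dirs.append("/usr/lib/crash/extensions")        # 4. fallback

    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if is_file(candidate):
            return candidate
    return None
```

Because the first match wins, a copy of an extension module in the current directory shadows an installed copy of the same name.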
- Changed the manner in which the "bt" command determines which PID 0
swapper task was interrupted by an ia64 INIT or MCA exception.
There is an existing ia64 INIT/MCA handler bug which incorrectly
writes the pseudo task's command name in its comm[] name string
such that the cpu number may not be part of the string. If that
happens without this patch, the "bt" command fails to make the link
back to the interrupted task, and displays the error message:
"bt: unwind: failed to locate return link (ip=0x0)!"
(anderson@redhat.com)
- Removed an unused initialized variable in get_task_mem_usage().
(junkoi2004@gmail.com)
- Added a debug-level 8 statement in readmem() that will display the
current input address and its translated physical address under the
existing debug-level 4 "<readmem: ...>" debug line, put in place to
aid in debugging read and/or seek errors.
(anderson@redhat.com)
- 4.0-7.7 to 4.0-8.8 incremental patch
(3/20/09)
4.0-7.7 - Also available in Fedora Rawhide devel branch:
build: dist-f11,devel:crash-4.0-7.7.2.f11
http://koji.fedoraproject.org/koji/buildinfo?buildID=83451
build: dist-f11-rebuild,devel:crash-4.0-8.7.2.f11
http://koji.fedoraproject.org/koji/buildinfo?buildID=84905
build: dist-f12-rebuild,devel:crash-4.0-9.7.2.fc12
http://koji.fedoraproject.org/koji/buildinfo?buildID=116824
- Because the ia64 and ppc64 architectures have configurable page
sizes, a host system running a crash session against a dumpfile may
have a different page size than the system that generated the
dumpfile. If the dumpfile is a compressed kdump vmcore or a
diskdump vmcore, the page size will be reset to the dumpfile header's
block_size variable if it does not agree with the host system's page
size. If the dumpfile is a 64-bit kdump ELF vmcore with vmcoreinfo
data that includes the crashing system's page size, that page size
will be used if the architecture is an ia64 or ppc64.
(holt@sgi.com, bwalle@suse.de)
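The page size selection described above can be summarized in a small sketch. The function, its parameters, and the dumpfile type strings are hypothetical labels chosen for illustration; only the decision rules come from the entry itself.

```python
def select_page_size(host_page_size, dump_type, machine,
                     block_size=None, vmcoreinfo_page_size=None):
    """Pick the page size to use for a session (illustrative sketch only)."""
    # Compressed kdump and diskdump vmcores carry block_size in the header;
    # it overrides the host page size when the two disagree.
    if dump_type in ("compressed_kdump", "diskdump"):
        if block_size is not None and block_size != host_page_size:
            return block_size
        return host_page_size

    # 64-bit kdump ELF vmcores may carry the page size in vmcoreinfo; it is
    # honored only on the architectures with configurable page sizes.
    if dump_type == "kdump_elf64" and vmcoreinfo_page_size is not None:
        if machine in ("ia64", "ppc64"):
            return vmcoreinfo_page_size

    return host_page_size
```

For example, analyzing a 64KB-page ppc64 compressed vmcore on a 4KB-page host would yield 65536 rather than the host value.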
- Fix for "mod -[sS]" command if the target module object filename
contains both underscore and dash characters. Without the patch
the module load would fail with the error message: "mod: cannot
find or load object file for <name> module". Examples are
the "aes_x86_64" module from the "aes-x86_64.ko" object file, and
the "dm_region_hash" module from the "dm-region_hash.ko" object file.
(anderson@redhat.com)
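The name mismatch arises because the kernel presents module names with dashes converted to underscores, while the object file keeps its original name. A minimal matching sketch, assuming that comparing both names with dashes normalized to underscores is sufficient (the function name is hypothetical and this is not the crash implementation):

```python
def object_file_matches(module_name, object_filename):
    """True if a .ko filename plausibly belongs to the given module name."""
    base = object_filename
    if base.endswith(".ko"):
        base = base[:-len(".ko")]
    # Normalize both sides so mixed dash/underscore names compare equal.
    return base.replace("-", "_") == module_name.replace("-", "_")
```

With this normalization, "dm-region_hash.ko" matches "dm_region_hash" even though the filename mixes both characters.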
- Reject s390 and s390x "L2^B" local label symbols from the kernel
symbol list. (bwalle@suse.de)
- Enlarge the string format buffer in the show_last_run() function to
prevent a buffer overflow when running "ps -l".
(sachinp@in.ibm.com)
- Fix for "bt -a" to continue with the backtraces of the remaining
active tasks when one of them encounters a fatal error. Without
the patch, the command is aborted when any of the backtraces fail.
(anderson@redhat.com)
- Only allow trusted versions of .crashrc and .gdbinit files to be
sourced during session initialization. (anderson@redhat.com)
- Fix for a potential but highly unlikely buffer overflow in the gdb
dwarfread.c and dwarf2read.c files, which requires a hand-crafted
object file with a location block (DW_FORM_block) that contains a
large number of operations. (anderson@redhat.com)
- Fix for a potential but highly unlikely integer overflow in the
Binary File Descriptor (BFD) library, which requires a hand-crafted
object file that specifies a large number of section headers,
leading to a heap-based buffer overflow. (anderson@redhat.com)
- Enable stack unwind on ia64 when using a kerntypes file as the
kernel namelist. (cpw@sgi.com)
- Fix for failure of "files -R" command option if an inode is unknown
due to a NULL f_dentry pointer in any open file structure because of
a kernel error condition. Without the patch, the command aborts
prematurely with the error message: "files: invalid input: ?".
(anderson@redhat.com)
- Allow an LKCD kerntypes debuginfo file created from a kernel module
to be loaded with the command: "mod -s <module> <kerntypes-file>".
(cpw@sgi.com)
- Increased NR_CPUS from 256 to 512 for x86_64, and from 128 to 1024
for ppc64. Made several NR_CPUS-bound static arrays in the internal
task_table and kernel_table structures dynamically allocated only
upon demand. (anderson@redhat.com)
- 4.0-7.6 to 4.0-7.7 incremental patch
(2/06/09)
4.0-7.6 - Fix for initialization-time failure if the kernel was built without
CONFIG_SWAP. Without the patch, it would fail during initialization
with the error: "crash: cannot resolve: nr_swapfiles"
(anderson@redhat.com)
- Fix for the "bt" command when run on x86_64 kernels that contain the
x86/x86_64 merger patch. Without the patch, non-active (blocked)
tasks do not start with "schedule", and as a result may contain
stale frame entries. (anderson@redhat.com)
- Fix for the usage of an input file of commands redirected during
runtime via "<", where more than one command in the input file
results in a fatal error. Without the patch, the handling of the
input file would go into an infinite loop repeatedly running the
second failed command. (anderson@redhat.com)
- Clean up causes for warning messages when compiling with gcc 4.3.2.
(anderson@redhat.com)
- Fix to prevent a segmentation violation during initialization when
parsing (corrupted) module symbols. Without the patch, if a kernel
module's Elf32_Sym/Elf64_Sym data structure contains a corrupt
"st_index" field, the resultant string table access could cause a
segmentation violation. (anderson@redhat.com)
- If an active task experiences a kernel stack overflow, the task's
thread_info structure located at the very bottom of the stack will
likely have its "cpu" field corrupted. Without the patch, any task
with a corrupt cpu value is not accepted, and the error message
"crash: invalid task: <task-address>" is displayed. With the
patch, an active task will be accepted based upon its existence as
the current task in a per-cpu runqueue structure, and there will be
a warning message indicating that the cpu value is corrupt.
(anderson@redhat.com)
- Modification of the "files" command when a task has an open file
referenced by a file descriptor, but the file structure's f_dentry
field is NULL. This is a kernel error condition, but without this
patch the "files" command does not display anything for that file
descriptor, as if the file has been closed or is not in use. This
patch displays the file descriptor number and the file structure's
virtual address. (anderson@redhat.com)
- Fix for the "bt" command on x86 Xen architectures when the backtrace
starts on the hard IRQ stack. Without the patch, the backtrace
may not properly make the transition back to the process stack
with the error message "bt: invalid stack address for this task",
or it may cause a segmentation violation. (anderson@redhat.com)
- 4.0-7.5 to 4.0-7.6 incremental patch
(1/09/09)
4.0-7.5 - Fix for "kmem -i" and "kmem -p" on 2.6.26 x86 CONFIG_SPARSEMEM
PAE kernels to account for the change in value of SECTION_SIZE_BITS.
(oomichi@mxs.nes.nec.co.jp)
- Fix for "bt -[tT]" options on x86 architectures when the backtrace
starts on the hard IRQ stack. Without the patch, the backtrace
may not properly make the transition back to the process stack.
(anderson@redhat.com)
- Fix for the "bt" command when run on a xen hypervisor in which the
backtrace leads to either "process_softirqs" or "page_fault".
Without the patch, the backtrace indicates: "bt: cannot resolve stack
trace", and then the recovery code terminates the command with the
nonsensical error message: "bt: invalid structure size: task_struct".
(oda@valinux.co.jp, anderson@redhat.com)
- Fix for the "kmem -[sS]" options that could cause a segmentation
violation or bogus "bad slab pointer" and "bad inuse counter" error
messages. Reported on 2.6.25-based CONFIG_DEBUG_SLAB kernels, but
could conceivably occur on any kernel with a kmem_cache.nodelists[]
array. (anderson@redhat.com)
- Fix for a bug in the SIAL extension when dealing with bitfields.
(olaf@sgi.com, hedi@sgi.com)
- Fix for the "files" command when run on 2.6.25 and later kernels,
which would either fail with an "invalid kernel virtual address"
error of type "fill_dentry_cache", or would show nonsensical/garbage
"ROOT" and "CWD" pathnames. This was due to the change in format
of the kernel's fs_struct. (anderson@redhat.com)
- Addition of a new "null-stop" environment variable that can be turned
on/off with the "set" command. It simply controls the embedded gdb's
"null-stop" print setting, which, if on, will stop printing character
arrays when the first NULL is encountered. The default setting is
still "off", so there will be no behavioral changes unless it is
turned on during runtime or in .crashrc files.
(anderson@redhat.com)
- Fix for the builtin "g" alias, which would fail with an "Ambiguous
command" error from the embedded gdb module.
(anderson@redhat.com)
- Fix to handle the 2.6.27 kernel's change of the module structure's
num_symtab, core_size and core_text_size members from long to int.
Without the patch, initialization-time failures would result when
running against 64-bit big-endian kernels, and potentially on little-
endian 64-bit kernels. (bwalle@suse.de)
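Why the failure is certain on big-endian and only potential on little-endian 64-bit kernels can be shown by reading a 4-byte member as if it were still 8 bytes wide. The helper names below are hypothetical and the byte layout is a simplified stand-in for two adjacent int members.

```python
def pack_two_ints(first, second, byteorder):
    """Lay out two adjacent 32-bit members as they would sit in memory."""
    return first.to_bytes(4, byteorder) + second.to_bytes(4, byteorder)


def read_as_int(raw, offset, byteorder):
    """Correct read: the member is 4 bytes wide."""
    return int.from_bytes(raw[offset:offset + 4], byteorder)


def read_as_long(raw, offset, byteorder):
    """Stale read: treats the member as an 8-byte long."""
    return int.from_bytes(raw[offset:offset + 8], byteorder)


if __name__ == "__main__":
    big = pack_two_ints(100, 0, "big")
    little = pack_two_ints(100, 7, "little")
    # Big-endian: the value lands in the upper half of the long.
    print(read_as_long(big, 0, "big"))
    # Little-endian: correct only if the neighboring member is zero.
    print(read_as_long(little, 0, "little"))
```

On big-endian the stale read is always wrong for a non-zero value; on little-endian it is wrong only when the following four bytes are non-zero.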
- Implement support for the /dev/crash driver being built into x86 or
x86_64 Red Hat kernels with the restricted /dev/mem driver. Without
the patch, if the kernel was built with CONFIG_CRASH configured as
"y" instead of "m", and crash was run against the resultant live
kernel, it would fail during initialization attempting to use the
restricted /dev/mem device. (anderson@redhat.com)
- If the /dev/crash driver module has been loaded prior to a live crash
session, then it will not be unloaded when the crash session exits.
Normally the module gets loaded by the crash utility during its
initialization on a live system, and then unloaded when the crash
session exits, regardless of whether the module was loaded by the crash
utility itself or if it was pre-loaded manually. However, if a cpu
subsequently hangs, then a live crash session attempt would also hang
when it tries to load the module. This patch will allow the crash.ko
module to be pre-loaded -- for example during kernel boot-time -- and
if a cpu subsequently hangs, a live crash session can be initiated to
investigate the problem. (anderson@redhat.com)
- Fix to recognize the 2.6.25 re-naming of the x86 user_regs_struct
structure members. Without the patch, running against a kdump
dumpfile would fail with the error: "crash: invalid structure member
offset: user_regs_struct_ebp". (anderson@redhat.com)
- Fix for initialization-time failure when running against 2.6.27
x86_64 xen kernels, which indicate "crash: cannot resolve: end_pfn".
(bwalle@suse.de)
- Fix for initialization-time failure when running against Xen 4.4
hypervisor binaries, which indicate "crash: invalid structure member
offset: domain_is_polling". (bwalle@suse.de)
- Added a new "p -u" option, which indicates that the gdb expression
argument evaluates to a user virtual address in the current context.
This option could be used, for example, if a known kernel data
structure exists at user virtual address in the current context,
or if the debuginfo data of a user program were loaded into the
crash session via the gdb "add-symbol-file" command.
(anderson@redhat.com)
- Fix for "bt -a" command when running against the xen hypervisor where
the number of physical cpus outnumber the MAX_VIRT_CPUS value for the
processor type. Without the patch on such a system, "bt -a" would
fail after displaying backtraces for the first 32 (MAX_VIRT_CPUS)
pcpus with the error message: "bt: invalid vcpu". The patch also
corrects the "vcpus" command output to show the vcpus associated with
pcpus 32 through 63, and the "doms" command output to show the second
idle domain associated with pcpus 32 through 63.
(oda@valinux.co.jp)
- Fix for the display of the processor speed on IBM Power6 hardware.
Without the patch, "MACHINE: ppc64 (unknown Mhz)" would be displayed
upon initialization and by the "sys" command.
(sachinp@in.ibm.com, acv@linux.vnet.ibm.com)
- 4.0-7.4 to 4.0-7.5 incremental patch
(12/05/08)
4.0-7.4 - Fix for a build regression for non-xen architectures introduced in
version 4.0-7.3. The ppc64, s390 and s390x architectures fail to
compile due to an undefined reference to "xen_hyper_print_bt_header".
(bwalle@suse.de)
- 4.0-7.3 to 4.0-7.4 incremental patch
(10/14/08)
4.0-7.3 - Fix for nonsensical usage of the "set" command when running
against the xen hypervisor binary. If entered alone on the
command line, the command would cause a segmentation violation,
because there is no concept of a "context" in the xen hypervisor.
In addition, more reasonable error messages are displayed if
"set", "set -c <cpu>", "set -p", or "set <address>" are
attempted when running against a xen hypervisor.
(anderson@redhat.com)
- Fix for "bt" command on x86 architectures when the backtrace
starts on the hard IRQ stack. Without the patch, the backtrace
may not properly make the transition back to the process
stack, and therefore not display the interrupt exception frame
or any kernel functions leading up to the interrupt.
(anderson@redhat.com)
- Fix for "search -k" option on some ia64 hardware, depending
upon the underlying physical memory layout. Without the patch
the command could fail prematurely with the error message:
"search: ia64_VTOP(a000000200000000): unexpected region 5 address".
(anderson@redhat.com)
- Fixes for the "bt" command when running against the xen hypervisor
binary. The "bt -o" option, and setting it to run by default with
"bt -O", would fail with the vmlinux-specific error message "bt:
invalid structure size: desc_struct" with a stack trace leading
to read_idt_table(); with the patch it will display the generic
error message "bt: -o option not supported or applicable on this
architecture or kernel". The "bt -e" or "bt -E" will also display
the same error message, as opposed to the command usage message.
Lastly, the "bt -R" option would cause a segmentation violation;
it has been fixed to work as it was designed.
(anderson@redhat.com)
- The "foreach" command has been removed from the set of commands
supported for usage with the xen hypervisor. If attempted, it
would always silently fail. (anderson@redhat.com)
- Fix for "irq -d" option when run on x86_64 xen kernels. Without the
patch it would indicate: "irq: invalid structure size: gate_struct"
and dump a stack trace leading to x86_64_display_idt_table(). Now it
will indicate that the -d option is not applicable.
(anderson@redhat.com)
- Avoid the symbolic translation of ia64 unity-mapped region 7 kernel
virtual addresses as they are displayed by the "bt -r" and "rd -[sS]"
commands. Without the patch, they are shown as "v+<offset>"
because "v" is an absolute symbol equal to 0xe000000000000000.
(anderson@redhat.com)
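On ia64 the region number is held in the top three bits of a virtual address, so the absolute symbol "v" at 0xe000000000000000 sits at the base of region 7. A sketch of the kind of check that avoids the "v+<offset>" translation follows; the function names and the lookup callback are hypothetical and do not reflect the actual crash code.

```python
IA64_REGION_SHIFT = 61  # the region number occupies bits 63..61


def ia64_region(vaddr):
    """Return the ia64 region number (0-7) of a 64-bit virtual address."""
    return (vaddr >> IA64_REGION_SHIFT) & 0x7


def format_address(vaddr, lookup_symbol):
    """Show region 7 addresses as plain hex; otherwise try a symbol lookup.

    lookup_symbol is a caller-supplied function returning a string or None.
    """
    if ia64_region(vaddr) == 7:
        return "%016x" % vaddr
    symbolic = lookup_symbol(vaddr)
    return symbolic if symbolic else "%016x" % vaddr
```

The same arithmetic places an address such as a000000200000000 in region 5.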
- Remove redundant storage of "swapper_pg_dir" symbol value during x86
initialization. (junkoi2004@gmail.com)
- Recognize the removal of the "jiffies" variable when running against
newer versions of the xen hypervisor by indicating "--:--:--" next
to the UPTIME display. (oda@valinux.co.jp)
- Fix to determine whether an x86 or x86_64 xen hypervisor was built
with PERCPU_SHIFT value of 12 or 13. Without the patch, crash
sessions running against a xen-3.3 hypervisor would fail during
initialization with the error message: "crash: cannot read elf note
core." (oda@valinux.co.jp)
- 4.0-7.2 to 4.0-7.3 incremental patch
(10/10/08)
4.0-7.2 - Fix for initialization-time failure when running against 2.6.27
x86_64 kernels, which indicate "crash: cannot resolve: end_pfn".
The patch sets the new 2.6.27 x86_64 PAGE_OFFSET value, handles
the change in the x86_64 "_cpu_pda" variable declaration, and
distinguishes paravirtual "pv_ops" kernels from traditional xen
kernels. (oomichi@mxs.nes.nec.co.jp, anderson@redhat.com)
- When an improper structure member offset or structure size is
attempted, a partial crash backtrace is displayed in the ensuing
error message. However, if the crash binary was stripped, it would
show "/usr/bin/nm: /tmp/crash: no symbols" instead of the address
and name of the symbol. This has been fixed to work with stripped
binaries if the crash symbol can be found in the crash binary; if
the crash symbol cannot be found, such as for static text symbols,
it will just display its address and "(undetermined)".
(bwalle@suse.de)
- crash.spec file addition: Requires: binutils
(anderson@redhat.com)
- Fix for LKCD kerntypes debuginfo files to use "node_states" when
"node_online_map" is not in use. (cpw@sgi.com)
- Implement support for s390/s390x CONFIG_SPARSEMEM kernels. Without
the patch, crash sessions would fail during initialization with the
error message: "crash: CONFIG_SPARSEMEM kernels not supported for
this architecture". (holzheu@linux.vnet.ibm.com)
- Fix for "kmem -[sS]" when running against 2.6.27 CONFIG_SLUB kernels,
in which the kmem_cache.objects and .order members were replaced by
a kmem_cache_order_objects structure. Without the patch, the command
would fail with the error message: "kmem: invalid structure member
offset: kmem_cache_objects". The fix also recognizes and supports
potentially variable slab sizes as introduced by the kernel patch.
(anderson@redhat.com)
- Increased the maximum number of SIAL commands from 100 to 200.
(cpw@sgi.com)
- 4.0-7.1 to 4.0-7.2 incremental patch
(9/15/08)
4.0-7.1 - Fix to address RT kernel's renaming of the address_space.nrpages
member to address_space.__nrpages. Without the patch, "kmem -i"
would fail with the error message "kmem: invalid structure member
offset: address_space_nrpages". (bwalle@suse.de)
- For crash utility debug backtraces displayed in error conditions,
the usage of __builtin_return_address() has been replaced with the
backtrace() function. This prevents crashes if the Makefile is
modified to compile with -O2. (bwalle@suse.de, anderson@redhat.com)
- Fix for ia64 hypervisor backtraces when the entries in the cpu map
are not contiguous. (takebe_akio@jp.fujitsu.com)
- Fix to make shell-escaped commands in a crash input file direct
their output properly. Without the patch, if the output of an
input file was redirected to a file or pipe, the output of any
shell-escaped commands in the input file only went to stdout.
(anderson@redhat.com)
- Fix to allow the usage of the "-i inputfile" command line option
when operating from an init script. Without the patch, the crash
session would fail during initialization with the error message:
"crash: /dev/tty: No such device or address". (anderson@redhat.com)
- Fix for the "kmem -P <address>" option, where <address> is an
invalid physical address. Without the patch, the command causes
a segmentation violation on an ia64; on other architectures an
unnecessary mem_map header is displayed prior to the error message.
(wency@cn.fujitsu.com)
- Fix for a potential endless cascade of SIGFPE exceptions during
session initialization when a vmlinux and vmcore do not match,
and a correct System.map or a non-debug vmlinux file is not supplied.
Doing that is allowable, but is certainly not recommended. In
this case, an incorrect kernel HZ value of 0 was calculated and used
for the initial "UPTIME:" display. (anderson@redhat.com)
- More gracefully handle a nonsensical "search -u <address>" command
attempt on a kernel thread or any context with no user address space.
Without the patch, the error message was related to a failed user
virtual address translation attempt; with the patch it now indicates:
"search: current context has no user address space".
(anderson@redhat.com)
- Reworked the "search" command for usage with the Xen Hypervisor.
When attempted on a Xen hypervisor, and depending upon the arguments
used, the command could cause a segmentation violation, display a
nonsensical error message, or appear to work but not necessarily
find the search target value even though it was there
in the specified memory range. To address the various shortcomings,
the following restrictions have been put in place for usage with the
Xen hypervisor:
(1) A starting virtual address must be supplied either symbolically
or with the "-s <address>" option.
(2) The (nonsensical) "-u" option is no longer accepted.
(3) The "-k" option is no longer accepted.
(4) When cycling through virtual memory, as soon as an address
cannot be read, the search will be quietly suspended. To
determine where the search was suspended, enter "set debug 1",
and then re-run the command.
(anderson@redhat.com)
- Fix for initialization-time segmentation violation due to a module
allocating and creating an exported symbol list outside of its own
virtual address space, and then overwriting its own symbol list
pointer. (anderson@redhat.com)
- Implementation of a "--minimal" command line option, which brings
up a crash session that is restricted to the "log", "dis", "rd",
"sym", "eval" and "exit" commands. This option may provide a way to
extract some minimal/quick information from a corrupted or truncated
dumpfile, or in situations where one of the several kernel subsystem
initialization routines, which are not called, would abort the
crash session. (sharyath@in.ibm.com, sachinp@in.ibm.com)
- 4.0-6.3 to 4.0-7.1 incremental patch
(8/19/08)
4.0-7 - License tag in crash.spec changed from "GPL" to "GPLv2"; otherwise
identical to 4.0-6.3. (spot@fedoraproject.org)
- Available only in Fedora Rawhide devel branch:
build: dist-f10,devel:crash-4.0-7
http://koji.fedoraproject.org/koji/buildinfo?buildID=56268
(7/15/08)
4.0-6.3 - Support for Fedora FC9 kernels containing the linux-2.6.utrace.patch,
which removes the task_struct.parent member. Without the patch, the
crash session fails during initialization with the error message:
"crash: invalid structure member offset: task_struct_parent".
(anderson@redhat.com)
- Available in Fedora Rawhide devel branch:
build: dist-f10,devel:crash-4.0-6.3
http://koji.fedoraproject.org/koji/buildinfo?buildID=47600
- Further scalability improvements to the "search -k" mechanisms.
(anderson@redhat.com)
- Changed ppc64 manner of determining the number of cpus to first check
the cpu_present_map, and only if that doesn't exist, continue to use
the cpu_online_map. Without the patch, depending upon which cpus
were offline, crash sessions could fail during initialization with
the error message: "crash: cannot determine idle task addresses from
init_tasks[] or runqueues[]". (anderson@redhat.com)
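The map preference can be sketched as below. This is an assumption-laden illustration: it treats each cpu map as an integer bitmask held in a dictionary of resolved symbols and counts the set bits, which may differ from how crash actually derives the cpu count.

```python
def count_cpus(symbols):
    """Count cpus from cpu_present_map if it exists, else cpu_online_map.

    symbols: hypothetical dict mapping a kernel symbol name to its bitmask.
    """
    if "cpu_present_map" in symbols:
        mask = symbols["cpu_present_map"]
    else:
        mask = symbols["cpu_online_map"]
    # Each set bit represents one cpu in the map.
    return bin(mask).count("1")
```

With four cpus present but two of them offline, the present map still accounts for all four, whereas the online map alone would report two.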
- Fix/workaround for the ppc64 "bt" command on panic/active tasks when
run against dumpfiles whose kernel had crashed with one or more
cpus offline. Without the patch, the "bt" command could cause a
segmentation violation, or fail because the starting stack location
and instruction pointer were invalid. With the patch, an error
message will be displayed, indicating that the NT_PRSTATUS note for
that task could not be determined. (anderson@redhat.com)
- Added support for vtop translation of 1MB large pages available on
new z10 (s390x) systems. (holzheu@linux.vnet.ibm.com)
- Prevent misleading init-time warning message for s390/s390x when
verifying the vmlinux file with respect to the host machine type.
Without the patch, this message would appear when running on s390
or s390x machines: "WARNING: machine type mismatch: crash utility:
S390X /usr/lib/debug/lib/modules/2.6.18-86.el5/vmlinux: (unknown)"
(holzheu@linux.vnet.ibm.com)
- Minor documentation fix to crash.8 man page, moving the "wr" command
from being munged into the "whatis" description into its own list
entry. (yamato@redhat.com)
- Support for running against an x86 xen-syms hypervisor binary based
upon xen 3.1.2 or later. Without the patch, the session would fail
to recognize that it was PAE, and "bt" commands on the non-active
task would fail with the error messages "bt: cannot resolve stack
trace" and "bt: invalid structure size: task_struct".
(oda@valinux.co.jp, anderson@redhat.com)
- Support for running against an x86_64 xen-syms hypervisor binary
based upon xen 3.1.2 or later. Without the patch, the session would
fail during initialization with the error message: "crash: cannot
resolve idle_pg_table_4". In addition, the x86_64 xen-syms
hypervisor is now relocatable, but the kdump vmcore does not
(currently) export the base physical address of the relocated
hypervisor text and static data. Without that knowledge, the crash
utility cannot make virtual to physical address translations, and
therefore cannot navigate through the vmcore. To address that
shortcoming, a patch is required for either the xen hypervisor code
or the kexec-tools package to export the value of the hypervisor's
"xen_phys_start" symbol to the vmcore. Until such time, however, a
workaround has been put in place to pass the value with a new command
line option that is invoked like so:
# crash --xen_phys_start <address> xen-syms vmcore
The value of the xen_phys_start <address> argument can be
determined in two ways, either from /proc/iomem on the live
system running the dom0 kernel that generated the kdump, or by
running crash on the target vmcore using the dom0 vmlinux file.
For example, on this system, the <address> argument would be
3ee00000:
# cat /proc/iomem | grep Hypervisor
3ee00000-3fdfffff : Hypervisor code and data
#
Alternatively, the vmcore file in this example indicates that the
<address> argument would be 0x3f000000:
# crash vmlinux vmcore
...
crash> px xen_hypervisor_res
xen_hypervisor_res = $3 = {
start = 0x3f000000,
end = 0x3fffffff,
name = 0xffffffff8049ab72 "Hypervisor code and data",
flags = 0x80000200,
parent = 0xffff880000001180,
sibling = 0x0,
child = 0xffff8800000000a8
}
If the --xen_phys_start command line option is not used, the session
will fail during initialization. However there will be a warning
message preceding the failure indicating: "WARNING: This hypervisor
is relocatable; if initialization fails below, try using the
--xen_phys_start <address> command line option". Eventually the
value of the hypervisor's "xen_phys_start" will be passed in the
vmcore header, obviating the need for this workaround.
(oda@valinux.co.jp, anderson@redhat.com)
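Extracting the --xen_phys_start value from /proc/iomem output can be scripted. The sketch below parses text in the format shown in the example above; the function name is hypothetical, and it assumes the "Hypervisor code and data" label appears exactly as in that example.

```python
def xen_phys_start_from_iomem(iomem_text):
    """Return the hypervisor start address as an int, or None if not found."""
    for line in iomem_text.splitlines():
        if "Hypervisor code and data" in line:
            # Lines look like: "3ee00000-3fdfffff : Hypervisor code and data"
            address_range = line.split(":", 1)[0].strip()
            start = address_range.split("-", 1)[0]
            return int(start, 16)
    return None


if __name__ == "__main__":
    sample = "3ee00000-3fdfffff : Hypervisor code and data\n"
    start = xen_phys_start_from_iomem(sample)
    print("crash --xen_phys_start %x xen-syms vmcore" % start)
```

The printed command line follows the invocation form given in the entry, with the parsed start address substituted for <address>.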
- 4.0-6.2 to 4.0-6.3 incremental patch
(4/30/08)
4.0-6.2 - Implemented a new "rd -S" option which, like the "-s" option,
displays the symbolic translation of kernel virtual addresses,
but also recognizes the virtual addresses of slab objects, and when
found, the address is replaced by the kmem_cache slab name string
inside brackets. (anderson@redhat.com)
- Make the found address displayed by "kmem -[sS] <address>" be the
address of the containing object if the <address> argument is
offset from the beginning of the object. This only applies to
kernels using kernel/slab.c; CONFIG_SLUB kernels currently do display
the address of the containing object.
(anderson@redhat.com)
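Rounding an interior address down to the start of its containing object is simple arithmetic when objects are laid out back to back. The sketch below assumes a simplified slab with a known first-object address, a fixed object size, and no padding between objects; the function name is hypothetical.

```python
def containing_object(address, first_object, object_size, num_objects):
    """Return the start address of the object containing address, or None."""
    offset = address - first_object
    if offset < 0 or offset >= object_size * num_objects:
        return None  # address falls outside this slab's object area
    # Drop the remainder to land on the object boundary.
    return first_object + (offset // object_size) * object_size
```

An address 0x10 bytes into the third 0x100-byte object therefore resolves to that object's starting address rather than to the raw argument.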
- Fix for "kmem -[sS] [address]" in 2.6.25 CONFIG_SLUB kernels, which
addresses changes in the kernel's per-slab free list tracking. Without
the patch, error messages of the type "kmem: invalid kernel virtual
address: 10700 type: get_freepointer" would be seen when the full
list of objects in a per-cpu slab was displayed.
(anderson@redhat.com)
- Fix for "kmem -[sS] <slab-address>" in 2.6.25 CONFIG_SLUB kernels,
in which the slab structure is actually a page struct. Some slab
addresses would not be recognized as such, and therefore without the
patch, error messages of the type "kmem: address is not allocated in
slab subsystem: <slab-address>" would be seen.
(anderson@redhat.com)
- Fix for an initialization-time failure with Ubuntu kernels because
of a mismatch between the /proc/version string and the linux_banner
string, due to additional information appended to the linux_banner
string in Ubuntu kernels. (anderson@redhat.com, asid@hp.com)
- Fix for the "net" command in 2.6.22 and 2.6.23 kernels, where the
"dev_base" net_device structure was replaced by the "dev_base_head"
list_head. Without the patch, the "net" command with no arguments
would fail with the error message: "net: dev_base does not exist!".
(eteo@redhat.com)
- Fix for the "net" command in 2.6.24 and later kernels where the
global "dev_base_head" list_head has been removed, and the network
devices are linked from the "init_net" net structure. Without the
patch, the "net" command with no arguments would fail with the
error message: "net: dev_base does not exist!".
(anderson@redhat.com)
- For kernels configured with CONFIG_SLUB, "kmem -S" has been updated
to properly differentiate whether a cache's "full" slabs are tracked
but whose full list is empty, or whether the full slabs are not
tracked at all. Without this patch, a cache's full list could be
indicated as "(empty)" instead of the more correct indication of
"(not tracked)". (i-kitayama@ap.jp.nec.com, anderson@redhat.com)
- Fix for the "vm" command when the crash session was invoked with
the -s command line option. Without the patch, if invoked prior to
a "set", "ps" or "vtop" command, the "vm" command run against a
task other than the initial context would mistakenly indicate that
the task contained no virtual memory.
(anderson@redhat.com, baiwd@cn.fujitsu.com)
- Fix/workaround for the "search -k" command option on relocatable
2.6-era ia64 machines configured with CONFIG_SPARSEMEM. Without
the patch, an immediate segmentation violation occurs.
(anderson@redhat.com, yzgcsu@cn.fujitsu.com)
- 4.0-6.1 to 4.0-6.2 incremental patch
(3/31/08)
4.0-6.1 - Support for 2.6.25 x86_64 kernels with the x86/x86_64 merger patch.
Without the patch, attempting a crash session would fail during
initialization with the error message: "crash: invalid structure
member offset: tss_struct_ist". (anderson@redhat.com)
- Support for 2.6.25 x86 kernels with the x86/x86_64 merger patch.
Without the patch, attempting a crash session on a dumpfile would
fail during initialization with the error message: "crash: invalid
structure size: user_regs_struct". (anderson@redhat.com)
- Fix for "bt" command when running on a live 2.6.25 x86 kernel with
the x86/x86_64 merger patch. Without the patch, "bt" would fail
with the error message: "bt: invalid structure member offset:
task_struct_thread_eip". (anderson@redhat.com)
- Fix for the "timer" command in 2.6.25 kernels. Without the patch
the command would fail with the error message: "timer: zero-size
memory allocation! (called from <user address>)".
(anderson@redhat.com)
- Cosmetic change to the x86 "bt" command to recognize the entry point
name change from "sysenter_entry" to "ia32_sysenter_target". Without
the patch, the entry point would indicate the "sysenter_past_esp"
assembly code label. (anderson@redhat.com)
- 4.0-6.0 to 4.0-6.1 incremental patch
(2/29/08)
4.0-6.0 - Available only as version 4.0-6.0.5 in Fedora's dist-f9/devel branch:
http://koji.fedoraproject.org/koji/buildinfo?buildID=37614
- When compiling within a 2.6.25-based build environment, four
"typedef unsigned int u32;" declarations are required due to a new
structure declaration in "asm-x86/ptrace-abi.h" that uses u32
members, but u32 is only defined in "asm-x86/types.h" within an
#ifdef __KERNEL__ section. I posted a patch on LKML to address the
ptrace-abi.h problem by changing the structure member declarations
to use __u32 typedefs, which was accepted in the -mm tree.
(anderson@redhat.com)
- 4.0-5.1 to 4.0-6.0 incremental patch
(2/20/08)
4.0-5.1 - Update "ps -l" to use task_struct.sched_info.last_arrival value
on 2.6.23 and later kernels that don't have a task_struct.last_ran
member. Without the patch, the option would fail with the error
message: "ps: neither task_struct.last_run nor task_struct.timestamp
exist in this kernel". (anderson@redhat.com)
- Fix for potential initialization-time failure when running against
2.4-era x86 netdump dumpfiles if the ebp and esp contents in the
ELF header's NT_PRSTATUS register dump do not contain a vestige of
the panic task's kernel stack address. Without the patch, there may
be one or more warning messages complaining about tasks not being in
the PID hash, followed by a fatal error message: "crash: invalid
kernel virtual address: <bad-address> type: 32-bit KVADDR", where
the <bad-address> can be any bogus kernel virtual address.
(anderson@redhat.com)
- Fix to make the unused do_radix_tree() function work as advertised.
(atyson@hp.com)
- Added zlib-devel to the crash-devel package-dependency Requires line
in the crash.spec file. (anderson@redhat.com)
- 4.0-5.0 to 4.0-5.1 incremental patch
(2/19/08)
4.0-5.0 - Tentatively scheduled as the baseline version for RHEL4.7 and RHEL5.2
crash utility errata releases; also built in Fedora Rawhide:
4.0-5.0.0 - RHEL4.7 errata version
4.0-5.0.2 - RHEL5.2 errata version
4.0-5.0.3 - Fedora Rawhide (devel branch)
- Fix for a potential segmentation violation during crash session
initialization if a task's kernel stack has been completely overrun,
corrupting its thread_info structure at the bottom of the stack.
This could occur running against kernels from 2.6.8 through 2.6.18.
With the patch, the suspect task will be reported during the task
initialization sequence. (anderson@redhat.com)
- Fix for the "bt" command when run on xen x86 dom0 dumpfiles, which
may potentially show empty backtraces for one or more active tasks.
(oomichi@mxs.nes.nec.co.jp)
- Initial support for OpenVZ kernels. (kshileev@sw.ru)
- 4.0-4.13 to 4.0-5.0 incremental patch
(1/17/08)
4.0-4.13 - If the vmlinux file or dumpfile is a machine type mismatch with
the crash utility binary, or far less likely, a ppc64 or ia64
endian mismatch, the crash session will fail during initialization
with the generic error message, "crash: <filename>: not a
supported file format". To aid the user in understanding what
caused the failure, this patch prepends an additional error
message that clarifies the reason behind the mismatch.
(anderson@redhat.com, bwalle@suse.de)
- Updated the "kmem -V" option, which currently displays the kernel's
"vm_stat" counter values, to also display the "vm_event_states"
counter values, both of which were introduced in 2.6.18. For 2.6
kernels prior to 2.6.18, the precursor "page_states" counter values
will be displayed. (anderson@redhat.com)
- Implemented a new "kmem -z" option to display per-zone memory
statistics. The amount of data displayed is dependent upon the
kernel version. At a minimum, the size, min/low/high and free
page counts are shown. If the zone struct contains nr_active,
nr_inactive, pages_scanned and all_unreclaimable members, those
fields are shown. If the zone struct contains a per-zone vm_stat[]
array (identical to the system-wide vm_stat[] array), its contents
are dumped. For any other data in the zone, the address of the
zone structure is displayed.
(anderson@redhat.com)
- Fix for the RSS amounts displayed by the "ps" and "vm" commands
on 2.6 kernels prior to 2.6.13. (anderson@redhat.com)
- Fix for the x86 "bt" command when running a version of crash built
on a pre-2.6.20 host against a 2.6.20 or later dumpfile, or when
running a version of crash built on a 2.6.20 or later host against
a pre-2.6.20 dumpfile. Without the patch, kernel exception frames
would be mistaken for, and displayed as, user exception frames, and
parts of the backtrace above the kernel exception frame would be
truncated. (atyson@hp.com)
- Fix for FC8 xen x86 kernels (2.6.21-2952.fc8xen) that fail during
initialization after reporting "WARNING: cannot read linux_banner
string", followed by the fatal error message "crash: vmlinux and
vmcore do not match!". This required a change to the virtual
address mask value used to determine the base value of the x86
kernel's unity-mapped virtual address region. (anderson@redhat.com)
- Set a default "phys_base" value for recent fully-virtualized
relocatable x86_64 kernels whose text start address is not equal
to the __START_KERNEL_map value. Without the patch, the crash
session fails during initialization with the warning message
"WARNING: cannot read linux_banner string", followed by the fatal
error message "crash: vmlinux and vmcore do not match!". The
error can alternatively be worked around if the "phys_base" value
is first determined by running a crash session on the live system
that generated the dumpfile, by entering: "help -m | grep phys_base".
The value shown can then be used when running against the dumpfile
like so: "crash --machdep phys_base=<value> vmlinux vmcore"
(anderson@redhat.com)
- Debug: implemented a new "--active" crash command line option, which
will gather only the active tasks from each runqueue, skipping the
traversal of the kernel's pid_hash mechanism.
(anderson@redhat.com)
- Debug: "help -n" formats and displays ASCII VMCOREINFO data.
(anderson@redhat.com)
- 4.0-4.12 to 4.0-4.13 incremental patch
(1/11/08)
4.0-4.12 - Fix for the "kmem -n" command to handle the 2.6.24 kernel replacement
of the "node_online_map" nodemask with its appropriate entry in the
new "node_states[]" nodemask array. Without the patch, the per-node
zone data would not be displayed, and any commands depending upon
the node table data would be affected. (anderson@redhat.com)
- Fix for "kmem -p" on 2.6.24 x86_64 kernels that are configured with
CONFIG_SPARSEMEM_VMEMMAP, which use a virtually-mapped page struct
array. Without the patch, the virtual-to-physical translation of
each page structure was invalid, and "kmem -p" would display invalid
data. This would also affect other commands as well, such as the
output of "kmem -i", and the output of a "vtop" command on a mapped
page address. Also, the virtual base address of the region is now
displayed by the "mach" command.
(oomichi@mxs.nes.nec.co.jp, anderson@redhat.com)
- Fix for the "dev" command's character device name string output to
recognize the change of the name structure member from a pointer
to an embedded string. Without the patch, 2.6.16 and later kernels
would display "(unknown)" character device names.
(olivier.daudel@u-paris10.fr, anderson@redhat.com)
- Fix for the "kmem -[sS]" command to handle the 2.6.24 change to
the CONFIG_SLUB kmem_cache structure, which re-worked the manner
in which the per-cpu slabs get referenced. Without the patch,
the command would fail with several error messages of the type:
"kmem: page_to_nid: invalid page: ffff81003993f4b0".
(anderson@redhat.com)
- Fix for the "kmem -[fF]" command to handle the 2.6.24 kernel change
of the free_area struct, which replaced the singular linked list
of pages with 5 (MIGRATE_TYPES) linked lists. Without the patch,
the command would fail with the error message: "kmem: unrecognized
free_area struct size: 88". (anderson@redhat.com)
- Fix for the "runq" command to handle the 2.6.24 kernel change to
the CFS scheduler that introduced per-cpu init_cfs_rq structures
for task group scheduling. Without the patch, no queued tasks
were displayed, because the rb_root of queued tasks was being
taken from the embedded cfs_rq in each per-cpu runqueue.
(anderson@redhat.com)
- 4.0-4.11 to 4.0-4.12 incremental patch
(12/12/07)
4.0-4.11 - Fix for task-gathering to handle the 2.6.24 pid_namespace-related
changes to the kernel pid_hash array. Without the patch, the crash
session fails during initialization with the message "crash: cannot
gather a stable task list via pid_hash (500 retries)".
(anderson@redhat.com)
- Fix for "kmem -f <address>" and "kmem <address>" commands on
x86 kernels, which may incorrectly indicate that the address is in
the kernel's free page list. Without this patch, if the address
argument is a physical address over 4GB, or a page struct address
referencing a physical address over 4GB, it is possible that the
address would incorrectly be shown as being in the kernel's free
page list. (anderson@redhat.com)
- Fix for x86 "bt" command for active tasks in Egenera dumpfiles
based upon LKCD version 7. Without the patch, the starting points
for the active task backtraces were erroneous.
(anderson@redhat.com)
- Fix for "kmem -S" error message if a slab object is found in both
a per-cpu list and on a slab's global free list. Without the patch,
the object address and cpu number values are flip-flopped in the
error message. (bob.montgomery@hp.com)
- 4.0-4.10 to 4.0-4.11 incremental patch
(12/6/07)
4.0-4.10 - Fix a regression introduced in 4.0-4.9 that causes the "kmem -p"
command to fail in SPARSEMEM kernels that have the struct
page.index member embedded in an anonymous union, which occurred
when the CONFIG_SLUB-related modifications were made to the page
struct in 2.6.22. Without the patch, "kmem -p" fails with the error
message "kmem: invalid structure member offset: page_index".
(anderson@redhat.com)
- 4.0-4.9 to 4.0-4.10 incremental patch
(11/21/07)
4.0-4.9 - Fix for the "kmem -p" command in kernels configured with
CONFIG_SPARSEMEM, i.e., not CONFIG_SPARSEMEM_EXTREME. Without
the patch, the page structure address for each physical page
was erroneous. (oomichi@mxs.nes.nec.co.jp)
- Fix for the "kmem -p" command output of MAPPING and INDEX values
on kernels where the mapping and index members of the page structure
are contained within anonymous unions. Without the patch, those
fields may be dashed-out.
(bob.montgomery@hp.com, anderson@redhat.com)
- Fix for the "mod" command to search for module object files in the
/lib/modules/<release>/updates directory tree before looking
in /lib/modules/<release>. (charlotte.richardson@stratus.com)
- Fix for the "waitq" command for 2.6.15-era and later kernels, which
replaced the __wait_queue.task member with the __wait_queue.private
member. Without the patch, the command would fail with the error
message: "waitq: invalid structure member offset: __wait_queue_task".
(atyson@hp.com)
- SIAL interpreter fix for an "operation on 'v1' may be undefined"
warning in sial_exeop(). (bwalle@suse.de)
- Fix for several unpredictable failure modes when attempting
"crash -h [command] > outputfile" from a shell command line.
(anderson@redhat.com)
- Addressed compiler warnings generated by extensions/echo.c and
extensions/dminfo.c. (bwalle@suse.de, anderson@redhat.com)
- Addressed compiler warnings generated by lkcd_common.c, lkcd_v8.c
and symbols.c when using:
-O2 -fmessage-length=0 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector
-fno-builtin-memset -fno-strict-aliasing
(bwalle@suse.de)
- Fix for "kmem -p" on i386 CONFIG_SPARSEMEM kernels with greater than
4GB of memory. Without the patch, the physical address value wraps
back to zero after physical page ffff0000.
(oomichi@mxs.nes.nec.co.jp)
- Fix to redirect SIAL script command output to pipes, files, etc., in
the same manner as native crash commands.
(Robert.Denman@teradata.com, anderson@redhat.com)
- Fix for ppc64 kernels with 64K pages whose PTE_RPN_SHIFT has changed
from 32 to 30. Without the patch, an initialization-time warning
message "WARNING: cannot access vmalloc'd module memory" would occur,
the "mod" command would fail with the same message, and "kmem -s"
failures could occur when attempting to read a kmem slab cache name
string. Translations and reads of vmalloc'd kernel virtual addresses
and user virtual addresses would appear to work, but bogus data was
returned because the resultant physical address that was read was
incorrect. (anderson@redhat.com)
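To illustrate why a stale shift value yields plausible-looking but bogus
reads, here is a minimal sketch. It assumes the physical address is derived
as (pte >> PTE_RPN_SHIFT) << PAGE_SHIFT with 64K pages (a PAGE_SHIFT of 16).
The function name and the PTE value are hypothetical and are not taken from
the crash sources.

```python
PAGE_SHIFT_64K = 16  # 64K pages

def pte_to_paddr(pte, rpn_shift, page_shift=PAGE_SHIFT_64K):
    """Extract the page frame number from a PTE and convert it to a
    physical address (assumed derivation, for illustration only)."""
    pfn = pte >> rpn_shift
    return pfn << page_shift

# Made-up PTE: page frame number 0x1234 stored with a shift of 30,
# with some arbitrary flag bits in the low part.
pte = (0x1234 << 30) | 0x393

good = pte_to_paddr(pte, 30)  # shift matching the kernel
bad = pte_to_paddr(pte, 32)   # older shift applied to the same PTE

print(hex(good), hex(bad))
```

Both results are well-formed addresses, which is consistent with the
translation appearing to work while the wrong physical page is read.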
- Fix for "kmem -s" if a slab cache whose name string cannot be read
is encountered. Without the patch, a fatal error message would be
displayed and the command aborted. With this patch, a non-fatal
warning message is displayed, and the cache name is indicated as
"(unknown)". (anderson@redhat.com)
- Fix for x86-64 SPARSEMEM kernels with CONFIG_NUMA off. Without the
patch, the crash session fails during initialization with the message
"crash: invalid structure member offset: pglist_data_node_mem_map".
(sachinp@in.ibm.com)
- Fix to use the ia64 physical start address from the LKCD dump header
instead of the default value. This was reported as a bug on an SGI
machine. (bwalle@suse.de)
- For s390[x] kernels the page table allocation method will be changed
such that instead of 3 levels, it will now be possible to allocate 4
levels. The current implementation of the page table walk functions
in the crash utility makes assumptions on how the page tables are
allocated by the kernel, e.g. 3 levels are hard coded. This patch
changes that, and the page table walk is done only according to the
s390 architecture without assumptions on the implementation in the
kernel. (holzheu@linux.vnet.ibm.com)
- Fix for LKCD dumpfile access failures that abort() the crash session
after displaying an error message indicating a problem with physical
memory zones in the dumpfile. Without the patch, the crash session
would end immediately after displaying an error message of the sort:
"conflicting page: zone 0, page 0: 0, 177160130 != 65536". That
error message will now only be displayed if the crash debug mode is 1
or more, a readmem() "seek error" will be displayed instead, and the
session will return to the "crash>" prompt. (anderson@redhat.com)
- 4.0-4.8 to 4.0-4.9 incremental patch
(11/20/07)
4.0-4.8 - Implemented support for kernels configured with CONFIG_SLUB, which
completely replaces the venerable "mm/slab.c" with the new
"mm/slub.c" kmalloc() slab subsystem. Accordingly, the
"kmem -s [address]", "kmem -S [address]", and "kmem <address>"
commands will display slab-related information in a similar manner
to what they currently do, with additional per-node information.
It should be noted that, due to slub.c's design, the verbose
"kmem -S" output will be pared down slightly to not display the
list of all "full" slabs unless the proper kernel slub debugging
has been turned on. However, given an address of an object from a
full slab page, or of the full slab page itself, that address
will then be traced back to its original slab cache and its data
displayed. (anderson@redhat.com)
- Change for support of LKCD dumpfile version 8 and later to determine
the backtrace starting registers from the dumpfile header. Increase
(maximum) NR_CPUS for ia64 to 4096.
(bwalle@suse.de)
- The SIAL interpreter extension module has been updated to support
the ia64, ppc64, s390 and s390x architectures. Several fixes have
been applied, and three new debug commands, sdebug, sclass and sname
have been added. (lucchouina@yahoo.com)
- Fixed a bug in the CONFIG_SPARSEMEM patch (contributed in 4.0-3.22)
in which a static pointer variable was initializing itself with a
buffer that was returned from a command-time-only GETBUF() call,
instead of using malloc(). It would then continue to use the buffer,
trampling on the buffer contents set up by whatever command that
subsequently allocated the buffer. I only caught this during the
CONFIG_SLUB development, so I have no examples (if any) of how this
would have ever manifested itself in a crash command error.
(anderson@redhat.com)
- Fixed the "mach" command in CONFIG_SLUB kernels which would abort
with the error message: "mach: cannot resolve cache_cache" when
trying to determine the value for the L1 CACHE SIZE display. Since
the generic manner of determining the cache size no longer worked
correctly anyway, the L1 CACHE SIZE display has been removed.
(anderson@redhat.com)
- Fix for missing NODE header in NUMA "kmem -f" output.
(anderson@redhat.com)
- Fix for the chronology of the contents of the kernel message buffer
output by the "log" command. (atyson@hp.com)
- Display a WARNING message if a PT_LOAD segment in an ELF-style
dumpfile advertises a memory segment that would go beyond the end
of the dumpfile. (bwalle@suse.de, anderson@redhat.com)
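The condition behind this warning can be sketched as follows, assuming it
amounts to comparing each segment's p_offset + p_filesz (the standard ELF
program header fields) against the size of the dumpfile. The function name
and the sample values are hypothetical.

```python
def segments_past_eof(segments, file_size):
    """Return the (p_offset, p_filesz) pairs whose data would end
    beyond the end of the file."""
    return [(off, size) for off, size in segments if off + size > file_size]

# Made-up PT_LOAD segments and a made-up dumpfile size.
FILE_SIZE = 0x8000
segments = [(0x1000, 0x4000), (0x5000, 0x10000)]

flagged = segments_past_eof(segments, FILE_SIZE)
for off, size in flagged:
    over = off + size - FILE_SIZE
    print(f"WARNING: PT_LOAD at offset {off:#x} extends {over:#x} bytes past EOF")
```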
- 4.0-4.7 to 4.0-4.8 incremental patch
(10/30/07)
4.0-4.7 - Incorporation of Luc Chouinard's SIAL interpreter (Simple Image
Access Language) as a crash extension module. When loaded with
the "extend" command, the sial.so module provides three commands,
"load" to load a SIAL script, "unload" to unload it, and "edit",
which unloads the script, brings up an $EDITOR-based edit session
of the script, and then loads it again. Also, when the sial.so
module is loaded, it will automatically load any SIAL scripts
found in the /usr/share/sial/crash or $HOME/.sial directories.
Therefore, by putting "extend <path-to>/sial.so" in either
./.crashrc or $HOME/.crashrc, all desired SIAL scripts may be
loaded on a particular machine in a hands-off manner. For details,
consult the README and README.sial files in the extensions/libsial
subdirectory. (lucchouina@yahoo.com)
- Removed hardwired-dependencies in the top-level and extensions
subdirectory Makefiles for building extension modules. Now it is
possible to copy an extension module's .c file into the extensions
subdirectory, and enter "make extensions" from the top-level to build
it. If the build of the module requires special handling, a .mk
makefile with the same prefix as the .c file may be provided, and
it will be automatically used to build it.
(jmoyer@redhat.com, anderson@redhat.com)
- When a 32-bit x86 xenU guest is run on an x86_64 dom0 host, the
new-style xen ELF format dumpfile contains an ELF header with an
e_machine type of EM_X86_64 (instead of EM_386). This was getting
rejected with the error message "crash: vmcore: not a supported
file format". The fix simply accepts the e_machine type mismatch,
since the new-style ELF format dumpfiles are 64-bit by default.
(anderson@redhat.com)
- Enhanced the "kmem <address>" option to also search for task_struct
and kernel stack addresses, and report them with the "set" output.
Also, fix for "kmem <vmalloc-address>", which did not display the
header for the mem_map data. (anderson@redhat.com)
- Fix for determining starting rip/rsp backtrace hooks for the panic
task in x86_64 xen dom0 kdumps; newer kernels have replaced the
call to "xen_machine_kexec" with "machine_kexec", and without this
patch may display back-traces with missing frames. Also on x86_64
non-xen kdump panic task backtraces, it is possible that the wrong
stack instance of "crash_kexec" is used as the starting hook, which
may also lead to missing frames. (anderson@redhat.com)
- Fix for ia64 LKCD dumpfiles where it is not possible to read the task
structure of the task that follows a task which is in the task address
"fixup list", and zeroes are returned instead. (atyson@hp.com)
- Fix for potential "mod -[sS]" failures with modules whose object
files contain an unusually large number of sections; module
loading attempts may issue a "<segmentation violation in gdb>"
message followed by the error message: "mod: [module name]: gdb
add-symbol-file command failed".
(carl.hsieh@teradata.com, anderson@redhat.com)
- Fix to prevent dumpfile reads beyond EOF when reading new (optimized)
xen ELF core xendumps. Without the patch, error messages of the sort:
"crash: cannot read index page [number]" may occur during session
initialization, with unpredictable run-time results.
(yamahata@valinux.co.jp)
- In x86_xen_kdump_p2m_create(), the same variable was being used as
the for-loop index in both an outer and an embedded inner for-loop.
As a result, if debug level was equal to or larger than 7, the outer
for-loop was repeated only once. (nishimura@mxp.nes.nec.co.jp)
- 4.0-4.6 to 4.0-4.7 incremental patch
(9/25/07)
4.0-4.6 - Also released as:
4.0-4.6.1 - RHEL5.1 errata version (beta)
4.0-4.6.2 - Fedora Rawhide (devel branch)
- Implemented the "runq" command for 2.6.20 and later kernels that have
replaced the O(1) scheduler with the CFS scheduler. If the kernel
was configured to use CFS, the command will display the tasks queued
in each cpu's RT and CFS runqueues. (anderson@redhat.com)
- The initial support put in place for the usage of "kerntypes"
debuginfo files only recognized files created by the LKCD
"dwarfextract" utility run against a -g built vmlinux kernel.
This version adds a new "-k" command line option that allows the
usage of standard -g compiled LKCD Kerntypes files.
(holzheu@linux.vnet.ibm.com)
- Update of "xencrash" support to properly handle dom0/hypervisor
kdumps taken under xen version 3.1 in addition to those taken under
xen 3.0.x. Without this patch, the following warning message
would be displayed during initialization of a xen-syms hypervisor
session: "WARNING: unsupported elf note format". Fixes x86 "bt"
command segmentation violation when running against a xen-syms
hypervisor. Fixes x86_64 session initialization failure when running
against a xen-syms hypervisor, which would display the error
message "crash: invalid structure member offset: tss_struct_rsp0".
(oda@valinux.co.jp)
- 4.0-4.5 to 4.0-4.6 incremental patch
(8/27/07)
4.0-4.5 - Addresses FC7/upstream x86 kernels that have been configured such
that the vmlinux symbol values do not match their relocated values
when loaded. If CONFIG_PHYSICAL_START contains a value that is
greater than CONFIG_PHYSICAL_ALIGN, then this mismatch occurs.
Since the crash utility and its embedded gdb have always expected
that the compiled-in kernel symbol addresses are "real", the virtual
to physical translation fails, leading to an initialization-time
failure with the message: "crash: vmlinux and /dev/crash do not
match!" (/dev/mem or the dumpfile name may replace /dev/crash).
To deal with this issue, there are several alternatives:
1) Configure the kernel with CONFIG_PHYSICAL_START less than
or equal to CONFIG_PHYSICAL_ALIGN. Having done that, there
is no problem; the resultant vmlinux file will be loaded at
the address for which it was compiled, which has always
been the case.
2) Since /proc/kallsyms uses the same format as a System.map file,
and since it reflects the relocated symbol addresses, it
can be placed on the crash command line as if it were
a System.map file. (Note that the System.map file created
by these relocated kernels contains the same "wrong" symbol
values as the vmlinux file from which it was created.)
3) On a live system that has /proc/kallsyms (i.e., the kernel was
configured with CONFIG_KALLSYMS), this version of the crash
utility will replace/patch the vmlinux symbol values with those
seen in /proc/kallsyms. The relocation value will be displayed
as a WARNING message during initialization.
4) On a dumpfile, the relocation will not be performed automatically
as on a live system. It will require the addition of the
/proc/kallsyms on the command line, or if run on a different
host, a copy of the crashed system's /proc/kallsyms may be
used.
5) Alternatively on a dumpfile, a new command line option has been
created to specify the relocation amount. For example, if a
kernel was configured with a CONFIG_PHYSICAL_START value of 16MB
and a CONFIG_PHYSICAL_ALIGN of 4MB, that results in a relocation
of 12MB. To specify that, enter "crash --reloc=12m ..." on the
command line. (Recall that if crash is run on the live system,
a WARNING message will specify the relocation amount.)
Using /proc/kallsyms or a --reloc=[size] as a command line argument
is similar to using a System.map file, in that it results in the loss
of the use of line number debug data. (anderson@redhat.com)
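As a rough illustration of alternatives 2 through 5, the relocation amount
can be thought of as the difference between a symbol's address in
/proc/kallsyms and its compiled-in address. The sketch below assumes the
"<hex address> <type> <name>" line format shared by System.map and
/proc/kallsyms. The function names, the choice of "_stext" as the reference
symbol, and all addresses are made up for illustration.

```python
MB = 1024 * 1024

def parse_symbols(text):
    """Parse System.map-style lines into a {name: address} dict.
    Extra trailing fields (e.g. a module name) are ignored."""
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        addr, _sym_type, name = fields[:3]
        symbols.setdefault(name, int(addr, 16))
    return symbols

def relocation(compiled, running, symbol="_stext"):
    """Difference between the running and compiled-in symbol address."""
    return running[symbol] - compiled[symbol]

# Hypothetical contents: compiled-in values versus relocated values.
system_map = "c0400000 T _stext\nc0412340 T schedule\n"
kallsyms = "c1000000 T _stext\nc1012340 T schedule\n"

reloc = relocation(parse_symbols(system_map), parse_symbols(kallsyms))
print(f"relocation: {reloc // MB}m")
```

With these made-up addresses the difference is 12MB, the same amount as in
the "--reloc=12m" example above.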
- Fix for x86 2.6.22 kernel initialization-time failure indicating:
"crash: invalid size request: 0 type: __per_cpu_offset"
(oomichi@mxs.nes.nec.co.jp)
- Fix to recognize that 2.6.22 kernels configured with CONFIG_SLUB
replace the kmalloc slab subsystem in "./mm/slab.c" with the
infrastructure in "./mm/slub.c". Without this
fix, crash sessions would fail during initialization with the message
"crash: invalid structure member offset: kmem_cache_s_c_num".
(anderson@redhat.com)
- Cliff Wickman sent an additional patch for the LKCD kerntypes
support he introduced in version 4.0-4.4, which addresses this
message that is seen during initialization on 2.6.22 kernels:
"WARNING: cannot determine pgdat list for this kernel/architecture".
(cpw@sgi.com)
- NOTE: The CONFIG_SLUB change in the 2.6.22 kernel will require a
significant update in the crash utility in order for "kmem -[sS]"
options to work again.
- NOTE: 2.6.20 and later kernels may have replaced the O(1) scheduler
with the new CFS scheduler. If configured to use CFS, the "runq"
command fails, which will require a crash utility update to recognize
and display the contents of each cpu's RT and CFS run queue.
- 4.0-4.4 to 4.0-4.5 incremental patch
(7/27/07)
4.0-4.4 - Fix for kernels in which the irq_desc_t typedef is not included in
the vmlinux debuginfo data, by using the 2.6-era struct irq_desc.
Without the patch, the "irq" command fails with the error message,
"irq: cannot determine size of irq_desc_t". (hugh@mimosa.com)
- Implemented new "irq -u" option that displays only in-use IRQs, now
that there can be several thousand entries in the irq_desc[] array.
(anderson@redhat.com)
- Prevent occasional 99% cpu usage waiting for the built-in less
command to complete. (anderson@redhat.com)
- Implemented support for the use of "kerntypes" debuginfo files that
are created by the LKCD "dwarfextract" utility, as an alternative to
the use of the vmlinux file. This requires the use of the matching
System.map file, as in this example:
# crash kerntypes System.map [vmcore]
This capability was written by Cliff Wickman of SGI, and he has
generously offered to maintain its functionality. (cpw@sgi.com)
- Fixes, code improvement and cleanup for "crash -h [command]".
(hugh@mimosa.com)
- The output of command data exceeding a terminal page-size has been
traditionally fed by default to "/usr/bin/less -E -X" with a prompt;
if the /usr/bin/less command was not available on the host system,
output would be fed to "/bin/more" instead. Scrolling can be turned
off with "set scroll off" or the built-in alias "sf", and back on
with "set scroll on" or the built-in alias "sn". This release
allows the user to specify an alternative scrolling program by
creating a CRASHPAGER environment variable, which will be used by default
if it exists. Also, the "set scroll [arg]" internal variable setting
command, which until now accepted "on" and "off" as arguments, now
accepts "less", "more" and "CRASHPAGER" as alternative arguments,
either during runtime or in .crashrc files. New crash command
line arguments have also been added to override the default and/or
.crashrc settings: --more, --less, and --CRASHPAGER. Lastly, the
output of the "crash -h [command]" will also use the relevant scroll
command selection. (anderson@redhat.com)
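The default selection order described in this entry can be sketched as
below. This is an illustration only: the function name is hypothetical, and
it does not model the "set scroll" settings or the --more, --less and
--CRASHPAGER command line overrides.

```python
import os

def choose_pager(environ=None, exists=os.path.exists):
    """Pick the scrolling command: CRASHPAGER if set, else less, else more."""
    environ = os.environ if environ is None else environ
    pager = environ.get("CRASHPAGER")
    if pager:
        return pager
    if exists("/usr/bin/less"):
        return "/usr/bin/less -E -X"
    return "/bin/more"

print(choose_pager())
```

Passing the environment and the existence check as parameters keeps the
sketch testable without touching the host system.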
- Updated crash(8) man page. (hugh@mimosa.com, anderson@redhat.com)
- 4.0-4.3 to 4.0-4.4 incremental patch
(7/20/07)
4.0-4.3 - Tentatively scheduled as the baseline version for RHEL4.6 and RHEL5.1
crash utility errata releases:
4.0-4.3.0 - RHEL4.6 errata version
4.0-4.3.1 - RHEL5.1 errata version
- Fix for "kmem -f" command on 2.6.17 and later CONFIG_DISCONTIGMEM
kernels. Without the patch, the command would fail with the error
message "kmem: cannot determine zone mem_map: TBD".
(troy.heber@hp.com)
- Fix for segmentation violation when using the wrong vmlinux file
command line argument on a live system on either the x86_64 or
ia64 architectures. If attempted with this version, the normal
"WARNING: vmlinux and /proc/version do not match!" message will
be followed by an additional warning message that displays the
Linux version number from /proc/version, and then the final message:
"crash: please use the vmlinux file for that kernel version, or
try using the System.map for that kernel version as an additional
argument." (anderson@redhat.com)
- For all 4 types of input-file processing:
1) $HOME/.crashrc
2) ./.crashrc
3) "crash -i input"
4) session-time "< input"
If a command in the input file encounters a FATAL error, the
remainder of the commands will still be executed. Until now, if any
command in the input file caused a FATAL error, the processing
of the remainder of the commands would be aborted.
(anderson@redhat.com)
- 4.0-4.2 to 4.0-4.3 incremental patch
(6/22/07)
4.0-4.2 - Fix for support of 2.6.22 kernels, which have changed the name
of the task_struct's "thread_info" member to the "stack" member.
This would cause the crash session to fail during initialization.
(troy.heber@hp.com, anderson@redhat.com)
- Fix to account for the number of pgdata nodes being less than the
number of cpus. Without the patch, the crash session would fail
during initialization with the error message: "crash: numnodes out
of sync with pgdat_list?" (sharyath@in.ibm.com)
- Implemented support for ia64 dom0/HV kdump dumpfiles, taken
either via the traditional kdump process, or simulated via the
Fujitsu "sadump" facility. (oda@valinux.co.jp)
- Created a "--no_panic" command line option to avoid the panic-task
search during initialization. (anderson@redhat.com)
- Implemented a new "ps -r" option to display resource limits (ulimits):
crash> ps -r 20618
PID: 20618  TASK: 1003cb82030  CPU: 1  COMMAND: "bash"
RLIMIT      CURRENT      MAXIMUM
CPU         (unlimited)  (unlimited)
FSIZE       (unlimited)  (unlimited)
DATA        (unlimited)  (unlimited)
STACK       10485760     (unlimited)
CORE        0            (unlimited)
RSS         (unlimited)  (unlimited)
NPROC       8180         8180
NOFILE      1024         1024
MEMLOCK     32768        32768
AS          (unlimited)  (unlimited)
LOCKS       (unlimited)  (unlimited)
SIGPENDING  1024         1024
MSGQUEUE    819200       819200
(anderson@redhat.com)
- Implement support for the registration of CLEANUP extension commands
that do not show up in help menu, but get called by restore_sanity().
Extension modules may also register HIDDEN_COMMAND functions; and the
"help -e" debug output has been enhanced. (anderson@redhat.com)
- Implemented a new symbol_value_module() primitive, primarily for use
by extension modules to quickly access the address of a module symbol
in cases where a name-clash may exist between the base kernel and/or
other modules. (anderson@redhat.com)
- The crash-4.0-4.2.src.rpm package will create an additional package
named crash-devel-4.0-4.2.i386.rpm, which is for use by extension
modules. The -devel package installs the top-level "defs.h" file in
"/usr/include/crash/defs.h". (anderson@redhat.com)
- 4.0-4.1 to 4.0-4.2 incremental patch
(6/04/07)
4.0-4.1 - Implemented dependable backtraces for the x86_64 architecture. (!!!)
This feature builds upon the current "low_budget" backtrace function,
and also required the fix for the BUG()/ud2a disassembly problem
addressed in 4.0-3.22. It does not require kernel unwind support,
but rather it calculates function framesizes by disassembling the
code from the beginning of the function to the point where it calls
the next function, parsing for add or sub instructions on the rsp,
and for push and pop instructions, thereby determining the framesize
of the function at the point of the call. This is similar to what is
done for x86, but requires far less hackery. You will notice a slight
hitch the first time a "bt" is done on a task, but for each text
return address in any backtrace, its framesize is cached for all
subsequent instances. It also accounts for backtrace text return
addresses from the .text.lock section, by appending "(via function)"
to the end of the frame line. Also, because it layers on top of the
current backtrace code, it does not compromise the capability of
switching between the process, IRQ, and exception stacks. That all
being said, 100% accuracy cannot be guaranteed. But for the ~30
sample dumpfiles I keep around for x86_64 testing, I cannot find any
obviously invalid backtraces. However, if there is any doubt, the
"bt -o" option will perform backtraces using the "old" manner; and
"bt -O" will force the old manner to always be used. Of course the
"bt -t" and "bt -T" options are still available. It's interesting to
redirect the output of "foreach bt" to a file using this version, and
then compare it with the output from an older version.
(anderson@redhat.com)
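The framesize calculation described above can be sketched roughly as follows. This is a minimal illustration, not the code crash actually uses: the function name estimate_framesize, the AT&T-syntax instruction strings, and the fixed 8-byte slot size are all assumptions made for the example.

```python
import re

SLOT = 8  # bytes per push/pop on x86_64 (assumed; ignores operand-size overrides)

def estimate_framesize(instructions):
    """Sum the stack growth implied by a list of disassembled instructions.

    `instructions` holds the text of each instruction from the start of a
    function up to the call of interest (hypothetical input format).
    """
    size = 0
    for insn in instructions:
        insn = insn.strip()
        if re.match(r"pushq?\s", insn):
            size += SLOT
        elif re.match(r"popq?\s", insn):
            size -= SLOT
        else:
            # Only immediate adjustments of rsp are recognized in this sketch.
            m = re.match(r"(sub|add)q?\s+\$(0x[0-9a-f]+),%rsp$", insn)
            if m:
                delta = int(m.group(2), 16)
                size += delta if m.group(1) == "sub" else -delta
    return size

prologue = [
    "push   %rbp",
    "mov    %rsp,%rbp",
    "push   %rbx",
    "sub    $0x18,%rsp",
    "callq  0xffffffff8100abcd",
]
print(estimate_framesize(prologue))  # 8 + 8 + 0x18 = 40
```

Caching the result per text return address, as the entry describes, would amount to storing this value in a dictionary keyed by the address.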
- Fix for s390 and s390x backtrace commands to recognize the kernel
structure name change from "runqueue" to "rq".
(holzheu@linux.vnet.ibm.com)
- Merged fourth round of "xencrash" patches, which allows a crash
session to alternatively be brought up against the xen-syms
binary instead of a vmlinux kernel. This patch enhances the
"doms" command display contents, and adds support to access the
ia64 frame table virtual address space so that the page_info table
can be accessed. (oda@valinux.co.jp)
- 4.0-3.22 to 4.0-4.1 incremental patch
(4/27/07)
4.0-3.22 - In kernel version 2.6.20 a "__bug_table" section has been added
to the kernel for x86 and x86_64, which contains the encoding for
the filename and line number information associated with each
instance of a kernel BUG(). Prior to that, x86 and x86_64 kernels
may have contained the filename/line-number encoding in the bytes
following the BUG()'s "ud2a" instruction. When disassembled, the
output would display a series of nonsensical instructions, or perhaps
one or more "(bad)" instruction lines, before eventually getting
back in sync with the actual instruction stream. Whether the
encoded bytes were included depends upon the kernel version,
whether CONFIG_DEBUG_BUGVERBOSE was configured, or whether an
"#if 1" surrounding the BUG() definition was manually changed.
This version of crash determines whether the encoded bytes exist,
and if so, the embedded gdb disassembler has been modified to
skip over those bytes, resulting in correct "dis" command output.
If necessary, a "dis -b" option has been added to override the
pre-calculated encoded byte count value. (anderson@redhat.com)
- Fix for the x86 backtrace code to also recognize the encoded
filename and line number information potentially following
"ud2a" instructions generated by kernel BUG() calls. In order
to determine the framesize of a function, the backtrace code
does its own text disassembly to count instances of push, pop,
and stack register increments/decrements. Without this patch,
the framesize calculation may either be too small or too large,
depending upon the contents of the encoded data following the
BUG()'s ud2a instruction. Therefore, it is possible that one or
more bogus frames are selected and displayed, and/or one or more
legitimate frames are skipped over. For example, when it affected
the framesize calculation of schedule(), backtraces of all non-active
tasks ending up in schedule() would be invalid. Here's an example in
which the schedule() framesize was miscalculated:
PID: 1292 TASK: ed78a000 CPU: 0 COMMAND: "setroubleshootd"
#0 [c07fdba8] schedule at c05f370e
#1 [c07fdcb4] __journal_file_buffer at ee05126d
#2 [c07fdcd8] __journal_file_buffer at ee05126d
#3 [c07fdd08] ext3_mark_iloc_dirty at ee08837d
#4 [c07fdd38] journal_dirty_metadata at ee052a13
#5 [c07fdd80] __find_get_block at c0463f59
#6 [c07fddac] __find_get_block at c0463f59
#7 [c07fddf0] find_get_page at c0444294
#8 [c07fddfc] filemap_nopage at c0446cf5
#9 [c07fde6c] find_extend_vma at c0454132
#10 [c07fde7c] get_futex_key at c042f9f6
#11 [c07fde94] futex_wake at c042fe2a
#12 [c07fdeb8] do_futex at c0430a19
#13 [c07fdfac] sys_poll at c047254b
#14 [c07fdfb8] system_call at c0404cf8
EAX: ffffffda EBX: 09f3da18 ECX: 00000002 EDX: 00000064
DS: 007b ESI: 00000064 ES: 007b EDI: 00342ff4
SS: 007b ESP: bfe76d04 EBP: bfe76d18
CS: 0073 EIP: 0094a402 ERR: 000000a8 EFLAGS: 00200246
With the fix, it looks like this:
PID: 1292 TASK: ed78a000 CPU: 0 COMMAND: "setroubleshootd"
#0 [c07fdba8] schedule at c05f370e
#1 [c07fdc0c] schedule_timeout at c05f3e7c
#2 [c07fdc30] do_sys_poll at c047243e
#3 [c07fdfac] sys_poll at c047254b
#4 [c07fdfb8] system_call at c0404cf8
EAX: ffffffda EBX: 09f3da18 ECX: 00000002 EDX: 00000064
DS: 007b ESI: 00000064 ES: 007b EDI: 00342ff4
SS: 007b ESP: bfe76d04 EBP: bfe76d18
CS: 0073 EIP: 0094a402 ERR: 000000a8 EFLAGS: 00200246
In the example above, the schedule() framesize was miscalculated
because the post-ud2a text contained the filename pointer address
c060fe0b, and the "60" was decoded as a "pusha" instruction; that
occurred twice, each time incrementing the framesize by 32 bytes.
(anderson@redhat.com)
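The arithmetic behind that miscalculation can be checked with a short sketch. It only shows that the little-endian encoding of the filename pointer contains the byte 0x60, the 32-bit x86 opcode for "pusha", which pushes eight 4-byte registers. Whether a disassembler actually treats that byte as an opcode depends on where sequential decoding happens to land, which this sketch does not model; the helper name pusha_hits is hypothetical.

```python
import struct

PUSHA_OPCODE = 0x60   # 32-bit x86 "pusha"
PUSHA_BYTES = 8 * 4   # eight general registers, 4 bytes each

def pusha_hits(pointer):
    """Return the byte offsets within a 32-bit little-endian pointer
    whose value equals the pusha opcode."""
    raw = struct.pack("<I", pointer)
    return [i for i, b in enumerate(raw) if b == PUSHA_OPCODE]

filename_ptr = 0xc060fe0b
print(struct.pack("<I", filename_ptr).hex())  # 0bfe60c0
print(pusha_hits(filename_ptr))               # [2]
print(2 * PUSHA_BYTES)                        # 64 bytes of error for two occurrences
```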
- Added preparations for an upcoming version update to kdump's
associated makedumpfile utility, which will return an error if a
read attempt of a page that has been explicitly excluded is made.
Until now, a zero-filled page was returned. To maintain the
current behavior of returning a zero-filled page when accessing
an excluded page, three options are available:
1) use the "--zero_excluded" crash command line option.
2) during runtime, enter "set zero_excluded on".
3) enter "set zero_excluded on" in your .crashrc file.
(anderson@redhat.com, oomichi@mxs.nes.nec.co.jp, bob.montgomery@hp.com)
- Implemented "help -n" debug output function for compressed diskdump
and compressed kdump dumpfiles. As is done for the other dumpfile
formats, the core file's header information along with any other
run-time dumpfile data is displayed. (anderson@redhat.com)
- If the page-exclusion "dump_level" of a compressed diskdump, a
compressed kdump, or an ELF diskdump dumpfile exists and can be
determined, its value and bitmask translation will be displayed as
part of the "help -n" dumpfile debug output. Also, as has been done
with partial ELF diskdumps, if a compressed diskdump or compressed
kdump can be confirmed as a partial dump, the "[PARTIAL DUMP]"
indicator will follow the dumpfile name during initialization and by
the "sys" command. (anderson@redhat.com, oomichi@mxs.nes.nec.co.jp,
indou.takao@jp.fujitsu.com, akiyama.nobuyuk@jp.fujitsu.com)
- Support for xendumps of fully-virtualized x86_64 relocatable
kernels. Without the patch, the physical base address was not
being determined, and the session would fail during initialization
with the error message: " crash: vmlinux and core do not match!"
(anderson@redhat.com)
- Fix for 4.0-3.21 "BOOKE" ppc.c patch, which failed to compile.
(antipov@ru.mvista.com)
- 4.0-3.21 to 4.0-3.22 incremental patch
(04/10/07)
4.0-3.21 - Introduced support for upstream xensource ELF format dumpfiles,
which will replace the current xendump format in xen 3.0.5. The
new xen format uses ELF in a non-standard manner such that memory
contents are defined in section headers instead of the traditional
manner of using program headers. Testing has been completed on
paravirtualized x86, x86 PAE, x86_64 and ia64 dumpfiles. Fully-
virtualized dumpfiles have not been tested. (anderson@redhat.com)
- A number of "xencrash" (where the session is run against a xen-syms
binary) fixes have been applied:
1) "bt" did not switch from the ia64 MCA stack to the vcpu stack.
2) "bt" caused an infinite loop if ar_bspstore contained an illegal
value.
3) "bt" showed an unnecessary unwind warning message. (ia64)
4) "man log" caused crash to fail with a segmentation violation.
5) "man log" did not have an example.
(oda@valinux.co.jp)
- Fix for "vtop" on x86 PAE kernels, which could abort upon reaching
the PTE translation section, showing the error message: "vtop:
cannot determine the swap location". (anderson@redhat.com)
- Fix for "vm -p" or "vtop" on 2.6 x86 PAE kernels, which could show
incorrect swap offsets, because the swap type/offset encoding was
moved to the high word of the 64-bit PTE. (anderson@redhat.com)
- Fix for "vm -p" on x86_64 kernels when a PTE referenced a swap
location, it would show "(not mapped)" instead of the swap location.
(anderson@redhat.com)
- In current 2.6 kernels, it is now possible to recognize ppc BOOKE
processors, which is the current default in crash. If the processor
is confirmed to not be BOOKE, then page table translation is done
differently. (antipov@ru.mvista.com)
- Fix for live system analysis of Ubuntu kernels due to a mismatch
between /proc/version and the linux_banner string. This was due
to an appendage to the linux_banner string in Ubuntu kernels.
(asid@hp.com)
- Fix for 2.6.21 kernels that fail during initialization with the
message: "crash: invalid (optional) structure member offsets:
zone_struct_free_pages or zone_free_pages". This was due to the
removal of the zone struct's "free_pages" member; instead the
zone struct's "vm_stat[NR_FREE_PAGES]" value is used.
(anderson@redhat.com)
- 4.0-3.20 to 4.0-3.21 incremental patch
(03/16/07)
4.0-3.20 - Merged third round of "xencrash" patches, which allows a crash
session to alternatively be brought up against the xen-syms
binary instead of a vmlinux kernel. This update introduces
support for ia64. (oda@valinux.co.jp)
- Verified support of live system analysis of ia64 xen kernels, and
removed unnecessary EFI memory verification warning message during
their initialization. (anderson@redhat.com)
- Added gdb's "shell" command to the prohibited gdb command list, and
updated the "help output" page to describe shell escape usage.
(anderson@redhat.com)
- Fix for the x86 "bt" command for the 2.6.20 kernel, which has added
the "xgs" field to the pt_regs structure. Without this patch, the
exception frame dump in "bt" would show invalid contents for several
registers; the fix also shows the GS register contents.
(anderson@redhat.com)
- Fix for the "mount" command for the 2.6.20 kernel to recognize the
new "nsproxy" field in the task_struct and the contents of the
nsproxy and mnt_namespace structures, in order to find the root
mount namespace. Without the patch, the command would fail with:
"mount: invalid kernel virtual address: 69 type: first list entry".
(anderson@redhat.com)
- Fix for the "files" command for the 2.6.20 kernel to handle the
removal of the fdtable "max_fdset" member. Without the patch, the
command would fail with: "files: invalid structure member offset:
fdtable_max_fdset". (anderson@redhat.com)
- Fix for the "net -[sS]" command options for the 2.6.20 kernel to
handle the removal of the fdtable "max_fdset" member. Without the
patch, the command would fail with: "net: invalid structure member
offset: fdtable_max_fdset". (anderson@redhat.com)
- Fix for the "vm" command for the 2.6.20 kernel to handle the removal
of the file structure's "f_dentry" member, and its placement inside
the embedded "path" structure. Without the patch the command would
fail with: "vm: invalid structure member offset: file_f_dentry".
(anderson@redhat.com)
- Fix for the "swap" command for the 2.6.20 kernel to handle the removal
of the file structure's "f_vfsmnt" member, and its placement inside
the embedded "path" structure. Without the patch the command would
fail with: "swap: invalid structure member offset: file_f_vfsmnt".
(anderson@redhat.com)
- 4.0-3.19 to 4.0-3.20 incremental patch
(02/21/07)
4.0-3.19 - Fix for support of paravirtual x86 xendumps that were:
1) created on host machines with greater than 4GB of memory, and
2) the active guest task at crash-time had been assigned a page
directory page (cr3) with a machine address greater than 4GB.
If both of the above apply, the crash session would fail with one of
two error messages, either "crash: cannot read/find cr3 page", or
"crash: cannot create xen pfn-to-mfn mapping". (anderson@redhat.com)
- Fix for the "kmem -p [page-struct-address]" command construct, which
would cause a segmentation violation when run on SPARSEMEM kernels.
(anderson@redhat.com)
- Added a new "struct -u" option, which indicates that the subsequent
address argument is a user virtual address in the current context.
This option could be used, for example, if a known kernel data
structure exists at a user virtual address in the current context,
or if the debuginfo data of a user program were loaded into the
crash session via the gdb "add-symbol-file" command.
(anderson@redhat.com)
- Added new "rd -f" and "struct -f" options, which indicate that the
subsequent address argument is a dumpfile file offset. These options
could be used, for example, to print a known kernel data structure
that exists in the dumpfile header, or to simply dump data directly
from the dumpfile. (anderson@redhat.com)
- Cosmetic fix to prevent double-printing of "kmem -p" and "kmem -v"
headers when those commands are passed multiple address arguments.
(anderson@redhat.com)
- 4.0-3.18 to 4.0-3.19 incremental patch
(02/07/07)
4.0-3.18 - Enhancement to the "mod" command to expand the number of section
arguments to the internal "add-symbol-file" command issued to gdb to
load the debug data for module objects. On most architectures, this
allows the usage of the command construct "p [module-symbol-name]" to
print out the module data structure in the same way that is done for
kernel proper data structure names. (castor.fu@3pardata.com)
- Two enhancements to significantly speed up the initialization of
crash sessions when running against multi-gigabyte xen kernels or
xendumps. The cache of mfn-to-phys_to_machine_mapping page has been
changed from a single-mfn-to-phys_to_machine_mapping page format to
storing a contiguous-range-of-mfns-to-phys_to_machine_mapping format.
This benefit is primarily seen during the "gathering module symbol
data" phase. The second change simply increases the size of the
pfn-to-xendump-page-offset cache. (anderson@redhat.com)
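The idea of caching contiguous ranges rather than single entries can be illustrated with the sketch below. It is not crash's data structure: the tuple layout (start_mfn, count, first_index) and the function names are assumptions chosen for the example.

```python
def build_ranges(mfns):
    """Collapse a list of mfns, given in mapping-page order, into runs of
    (start_mfn, count, first_index) so consecutive mfns share one entry."""
    ranges = []
    for idx, mfn in enumerate(mfns):
        if ranges and ranges[-1][0] + ranges[-1][1] == mfn:
            start, count, first = ranges[-1]
            ranges[-1] = (start, count + 1, first)
        else:
            ranges.append((mfn, 1, idx))
    return ranges

def lookup(ranges, mfn):
    """Return the mapping-page index for mfn, or None if it is not cached."""
    for start, count, first in ranges:
        if start <= mfn < start + count:
            return first + (mfn - start)
    return None

mfns = [100, 101, 102, 500, 501, 7]
ranges = build_ranges(mfns)
print(ranges)               # [(100, 3, 0), (500, 2, 3), (7, 1, 5)]
print(lookup(ranges, 501))  # 4
```

When the mfns are mostly contiguous, the number of cache entries to search drops from one per page to one per run.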
- Fix for a segmentation violation during the "gathering task table
data" phase of initialization if the thread_info structure of the
runqueue-advertised active task has been freed. This has only ever
been seen in a xendump created by "xm dump-core -L [guest-domain]".
(anderson@redhat.com)
- Cosmetic fix to prepend newlines to messages that happen to be
generated during any of the "please wait" segments of initialization.
(anderson@redhat.com)
- Addressed several compiler warnings when using -D_FORTIFY_SOURCE=2.
Some are in gdb code that is never exercised, others were legitimate
but would require impossible code paths, but one of them could
result in runaway "help -t" output if the kernel was built without
IKCONFIG. (bwalle@suse.de)
- Fix for the s390x "bt -f" command option, which was displaying the
stack as a sequence of 32-bit words which were dumped "backwards",
i.e., at the wrong offset. (krader@us.ibm.com)
- 4.0-3.17 to 4.0-3.18 incremental patch
(02/01/07)
4.0-3.17 - Two fixes for "dev -p" command option:
1) The head entry of the PCI device list was being skipped.
2) For systems with no PCI devices, exit gracefully rather than
failing the command due to the use of an invalid virtual
address.
(rachita@in.ibm.com, anderson@redhat.com)
- Fix to recognize "linux_banner" symbol type change from 'R'
to 'r' in 2.6.20-rc2 kernels. Without the patch, the session
fails during initialization with the error message " WARNING:
invalid linux_banner pointer: 756e694c", and then "crash: vmlinux
and vmcore do not match!". (vgoyal@in.ibm.com)
- Fix to recognize "__per_cpu_start" and "__per_cpu_end" symbol
type change from 'A' to 'D' in relocatable kernels. Without
the patch, SMP kernels running on uniprocessor systems may fail
during initialization with the message "crash: cannot resolve
init_task_union". (sachinp@in.ibm.com)
- Fix for the xencrash "dumpinfo -t" command to properly cycle
through the ELF_timeval structures for each cpu.
(anderson@redhat.com)
- Fix for x86_64 backtraces that may end prematurely at either a
stale "schedule" or "schedule_timeout" reference when doing a
"bt" on an active task in a dumpfile. (anderson@redhat.com)
- Fix for a possible empty panic message in 2.6 kernels both during
initialization and when running the "sys" command, because of
the change of the kernel panic() string from "Kernel panic: " to
"Kernel panic -- not syncing: ". If the panic message was not
recognized in another manner, such as by an oops message, by a
kernel BUG message, or sysrq-generated crash, the "PANIC:" status
would be empty. (anderson@redhat.com)
(01/12/07)
4.0-3.16 - Recognize new XC_CORE_MAGIC_HVM xendump magic number, which in turn
introduces support for xendumps of fully-virtualized ia64 kernels.
(oda@valinux.co.jp)
- Recognize an INVALID_MFN marker in the indexed mfn list of a xendump,
and if found, fail the read attempt on the associated pfn.
(oda@valinux.co.jp, anderson@redhat.com)
(12/21/06)
4.0-3.15 - Introduced support for xendumps of fully-virtualized x86 kernels
taken while running on an x86 Xen host (32-bit on 32-bit host).
(anderson@redhat.com)
- Introduced support for xendumps of fully-virtualized x86 kernels
taken while running on an x86_64 Xen host (32-bit on 64-bit host).
(anderson@redhat.com)
- Introduced support for xendumps of fully-virtualized x86_64 kernels.
(anderson@redhat.com)
- Introduced support for xendumps of para-virtualized ia64 kernels.
It should be noted that currently the ia64 Xen kernel does not
lay down a switch_stack for the panic task, so only raw "bt -t"
backtraces can be done on the panic task. (anderson@redhat.com)
- Introduced support for "xm save" dumpfiles of para-virtualized ia64
kernels, which use a completely different format than that used for
x86 and x86_64. (anderson@redhat.com)
- Additional support for the current kexec/kdump patch for Xen:
1) Merged second round of "xencrash" patches, which allows a crash
session to be alternatively brought up against the xen-syms
binary instead of a vmlinux kernel. (oda@valinux.co.jp)
2) Using the xencrash feature above, the pfn_to_mfn_list_list value
of any guest domain that was running when the dom0 or hypervisor
crashed can be determined; that pfn value can in turn be used
as an argument to a new "--p2m_mfn [pfn]" crash command line
option. That will allow a crash session to be run against any
guest domain. Therefore, with a single dom0/hypervisor vmcore,
the following types of crash sessions may be initiated:
$ crash vmlinux-dom0 vmcore
$ crash xen-syms vmcore
$ crash --p2m_mfn [pfn] vmlinux-guest-#1 vmcore
$ crash --p2m_mfn [pfn] vmlinux-guest-#2 vmcore
$ ...
(anderson@redhat.com)
3) Fixed "help -n" debug output to properly display the contents
of the new XEN_ELFNOTE_CRASH_INFO and XEN_ELFNOTE_CRASH_REGS
ELF note types. (anderson@redhat.com)
- Turn off the LKCD dumpfile-access "spinner" when "crash -s" is used.
(castor.fu@3pardata.com)
- Update to MODULES_IN_CWD code segment so that it will work on 2.6
kernels where modules end with ".ko". This requires that kernel.c
is compiled with -DMODULES_IN_CWD. (castor.fu@3pardata.com)
- Support LKCD "map" files in lieu of standard System.map files.
Without this patch, crash would fail with an error message of the
sort: "crash: map.4: not a supported file format". (bwalle@suse.de)
- The ia64 PR_UNALIGN_NOPRINT and PR_FPEMU_NOPRINT prctl commands have
been moved earlier in time, in order to prevent "unaligned access"
messages when accessing ELF header contents. (anderson@redhat.com)
- The dlopen() call used by the "extensions" facility has been changed
to use the RTLD_GLOBAL flag, so that symbols from an extension object
will be visible to subsequently loaded modules. (asid@hp.com)
(12/20/06)
4.0-3.14 - Tentatively scheduled for RHEL5-GA
- Added support for Magnus Damm's latest kexec/kdump patch for Xen.
The ELF header of the vmcore, which is a full memory dump of the
dom0/hypervisor combination, contains a XEN_ELFNOTE_CRASH_INFO note
that contains the pfn_to_mfn_list_list value for dom0, so that
pfn-to-mfn translations may be made for crash analysis of the dom0
linux kernel. (anderson@redhat.com)
- Added support for recognizing the zero-fill segments in ELF vmcore
files created by the makedumpfile command from kdump /proc/vmcore files.
Without this patch, ELF vmcore files generated by makedumpfile could
only be used by gdb. (anderson@redhat.com)
- Updated the 4.0-3.4 patch that addressed the bogus kernel-/proc/version
mismatch initialization failures using recent s390x vmlinux files that
contain an ASCII character just preceding the Linux version string.
That patch fixed the problem when the vmlinux file name was placed on
the crash command line; this version also fixes it when "crash" is
entered alone on the command line, and it has to search for the vmlinux
file. (anderson@redhat.com)
(12/01/06)
4.0-3.13 - Adapted the "xencrash-0.2" patch described here:
https://www.redhat.com/archives/crash-utility/2006-November/msg00036.html
This functionality consists of three inter-dependent parts, all of
which are still under development:
1) the kexec-tools user package
2) the kdump kernel patch for Xen
3) the crash utility
The end result will be a single crash binary that can be used with
either the Xen dom0 vmlinux kernel, or with the xen-syms hypervisor binary,
with the common vmcore created when either of those two entities crash.
(oda@valinux.co.jp, anderson@redhat.com)
- Fixed the initialization-time, and "sys" command, displays of the system
memory size when memory nodes have holes. Without this patch, more memory
than what is installed may be displayed. (anderson@redhat.com)
(11/27/06)
4.0-3.12 - For 2.6.14 and later ia64 kdumps, taken either as a result of the
INIT switch, or when an MCA exception has occurred, several problems
needed to be addressed. First, the "pseudo-task" that handles the
kdump operation due to an INIT or MCA was not being recognized as
the "panic" task. Secondly, the backtraces of the per-cpu INIT
or MCA handling pseudo-tasks only went back as far as their entry
onto their own per-cpu stacks, and did not show the backtrace of
the task that was running on that cpu when the INIT or MCA event
occurred. This version recognizes the pseudo-task that handles the
kdump operation; and for each cpu, the active tasks' backtraces now
also show a transition back to the task that was running on that cpu
when the INIT or MCA event occurred. (j-nomura@ce.jp.nec.com)
- To address the need to display per-cpu variables, the "p"
command has been modified to recognize "per_cpu__xxx" arguments
when the kernel is SMP, in order to prevent the attempt to display
the contents of a variable whose symbol value does not represent
the actual location of its data. In that case, the data type of
the per-cpu variable will be displayed, followed by the addresses
of each per-cpu instance. Given that information, a proper command
can be utilized in order to display the data. For example, to look
at the per-cpu buffer_head accounting for cpu 2:
crash> p per_cpu__bh_accounting
PER-CPU DATA TYPE:
struct bh_accounting per_cpu__bh_accounting;
PER-CPU ADDRESSES:
[0]: c5405a80
[1]: c540da80
[2]: c5415a80
[3]: c541da80
crash> bh_accounting c5415a80
struct bh_accounting {
nr = 434,
ratelimit = 2216
}
Note that "p" on the first command line above is optional, because
whenever a data variable is entered alone, crash will recognize it
as such, and pass it to the "p" command by default. I had thought
of putting this functionality into the "struct" command, but many
of the per-cpu variables are pointers, arrays, etc. So for the
non-structure cases, the "rd" command would be more appropriate,
or alternatively a cobbled-together gdb print command.
(anderson@redhat.com)
- A consolidated cleanup and minor fixes patch has been applied to
the experimental x86_64 dwarf CFI unwind facility.
(rachita@in.ibm.com)
- Also related to the experimental x86_64 dwarf CFI unwind facility,
fixed a problem where if a "set unwind on" was done, and followed
by a subsequent "set unwind off", then the "bt" output could either
cause a segmentation violation, or display backtrace data that was
different from the original. (anderson@redhat.com)
(11/15/06)
4.0-3.11 - Tentatively scheduled for RHEL5-B2
- Updated fix for 2.6.18 x86_64 kernels to address the change in
the IRQ-stack-to-process stack linkage; the fix introduced in
4.0-3.9 could fail depending upon the crash session's display
window size, due to a behind-the-scenes gdb line-wrap of text
disassembly. (anderson@redhat.com)
(11/09/06)
4.0-3.10 - [Red Hat internal -- identical to 4.0-3.9]
4.0-3.9 - Tentatively scheduled as errata version for RHEL4-U5.
- The current 2.6.18 x86_64 kernel has changed the IRQ-stack-to-
process-stack linkage, where until now the link value was a pointer
to the exception frame on the process stack, but has been changed
to point to a location on the process stack above the exception
frame. Because of that, after displaying the trace data from the
IRQ stack, "bt" would then display an invalid exception frame,
which was reported as a "possibly bogus exception frame".
(anderson@redhat.com)
- Also in x86_64 kernels, fix for the "bt" command. When the backtrace
started on the NMI exception stack, it was displaying the correct
exception frame data, but was erroneously reporting that it was a
"possibly bogus exception frame". (anderson@redhat.com)
- And again in x86_64 kernels, fix for the "bt" command when making
the transition from the IRQ stack back to the process stack, when
the IRQ stack entry was made via the relatively new "call_softirq"
entry point. In that case, there is no exception frame on the
process stack, because it's essentially just a cross-stack call
from do_softirq(). However, a bogus exception frame was being
displayed, along with a "possibly bogus exception frame" message;
and if the RIP value in the truly bogus exception frame happened
to fall in the user virtual address range, the remainder of the
process stack trace was not displayed at all. (anderson@redhat.com)
- Fix for 2.6.18-era ia64 DISCONTIGMEM kernels, which would fail
during initialization with the error message: "crash: invalid
(optional) structure member offsets: pglist_data_node_next or
pglist_data_pgdat_next". (anderson@redhat.com)
- Adapted Olivier Daudel's nifty enhancement to the "struct" command,
which allows the single "struct.member" argument to optionally be
expressed in a "struct.member[,member,member]" format, in order to
display multiple members of a given structure. This also applies to
the "union" and "*" commands, as all three functions have now been
combined into one behind the scenes. Fixed the display for applying
a minus count, and given that it opened up the door to a number of
entry errors, I also added additional error-catching/handling to avoid
the display of incorrect structure data.
(olivier.daudel@u-paris10.fr, anderson@redhat.com)
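The "struct.member[,member,member]" argument format can be split along the lines sketched below. This is only an illustration of the syntax, with a hypothetical function name and error handling; it is not the parsing code used by crash.

```python
def parse_struct_members(arg):
    """Split 'struct.member[,member,...]' into (struct_name, [members]).

    A bare structure name yields an empty member list. Empty names or
    empty members are rejected, mirroring the kind of entry errors the
    changelog entry says had to be caught.
    """
    name, sep, rest = arg.partition(".")
    if not name:
        raise ValueError("missing structure name: %r" % arg)
    members = rest.split(",") if sep else []
    if any(not m for m in members):
        raise ValueError("empty member name in: %r" % arg)
    return name, members

print(parse_struct_members("task_struct.pid,comm,state"))
# ('task_struct', ['pid', 'comm', 'state'])
print(parse_struct_members("task_struct"))
# ('task_struct', [])
```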
- Fixed three sources of potential segmentation violations when using
the "bt" command when the experimental dwarf CFI unwind backtrace
facility was turned on. (anderson@redhat.com)
- Added a new machdep_init(POST_VM) call, which is currently only being
used by the x86_64 architecture; it calls init_unwind_table(), which
has to be done after vm_init() in order to access the unwind tables
of kernel modules. (anderson@redhat.com)
- Prevent ia64 "floating-point assist fault" and "unaligned access"
console messages by issuing PR_FPEMU_NOPRINT and PR_UNALIGN_NOPRINT
prctl() settings. (anderson@redhat.com)
(11/02/06)
4.0-3.8 - Fix for the "irq" command when run on 2.6.17 and later kernels, which
replaced the hw_interrupt_type structure with the irq_chip structure.
Without the patch, the command would fail with the error message
"irq: invalid structure member offset: irq_desc_t_handler".
(rachita@in.ibm.com)
- Phased in the first stage of support for the use of dwarf CFI data to
produce accurate x86_64 back traces, and to eventually improve the
reliability of x86 back traces. The code is very much still under
development, and is not turned on as of yet; for x86_64 only, its
usage can be toggled on and off with the set command, by entering
"set unwind on" or "set unwind off". It will only work if dwarf CFI
information exists in the kernel memory, or if the vmlinux file
contains an .eh_frame section. Expect multiple iterations before
this feature is ready for prime-time.
(rachita@in.ibm.com, anderson@redhat.com)
- Prevents stream of invalid "WARNING: possibly bogus exception frame"
messages during initialization when run against x86_64 xendump
dumpfiles created with the new "xm dump-core" facility.
(anderson@redhat.com)
- Fix for the "struct -o" option to print structure member offsets if
the member type is a function pointer. (anderson@redhat.com)
(10/20/06)
4.0-3.7 - Support for paravirtualized x86_64 RHEL4 Xen kernels, which require
the use of unique hardwired kernel VM addresses, as well as a new
user vtop function. Without the patch, crash would report several
read errors during invocation, and then eventually die with this
message: "crash: cannot access phys_to_machine_mapping page".
(anderson@redhat.com)
- Fix for accessing user space stack addresses in ia64 kernels with
3-level page tables. This was a regression introduced in 4.0-3.1,
and would cause the new "ps -a" option to fail with an error message
such as: "ps: cannot access user stack address: 60000fffffffbe28".
Also, if the user stack address was given as the argument to the
"vtop" command, it would indicate "(not mapped)".
(anderson@redhat.com)
- Implemented a new "sig -g" option, which breaks down the signal
information into a common per-thread group section, followed by
the signal information relevant to each task in the thread group.
Added the capability of using the option via "foreach sig -g".
(olivier.daudel@u-paris10.fr, anderson@redhat.com)
- Update to allow the entry of multiple "list -s struct.member"
arguments, in order to display multiple members from each structure.
Added the capability of entering a single "-s" option with multiple
members entered in a comma-separated list, i.e., using the option
format "-s struct.member1,member2,member3".
(olivier.daudel@u-paris10.fr, anderson@redhat.com)
- The refresh_hlist_task_table() and refresh_hlist_task_table_v2()
functions now recognize when the number of running tasks exceeds their
internal table size, and realloc's task space as required. Without
the patch it would be possible to not access all tasks in a live
system if the number of tasks increased (rather dramatically) from the
time that the crash session started. (anderson@redhat.com)
- Added a new hash queue tool called hq_entry_exists(). The function
may be helpful in an extension, or future patch, to query for the
existence of an entry in the current hash queue. (jmoyer@redhat.com)
(10/13/06)
4.0-3.6 - Workaround for pre-2.6.17 kernels whose vmlinux file does not
contain debug information for the "pid_hash" array. Without this
patch, the crash session would fail during initialization with the
error message: "crash: cannot determine pid_hash array dimensions".
This problem appears to be limited to kernels built with gcc
version 4.0.0, which had a known regression that omitted debug
information for uninitialized variables. (anderson@redhat.com)
(10/05/06)
4.0-3.5 - Implemented new "ps -a" option which, when available, displays the
complete command line and environment variables of selected, or all,
tasks. For example:
crash> ps -a automount
PID: 3948 TASK: f722ee30 CPU: 0 COMMAND: "automount"
ARG: /usr/sbin/automount --timeout=60 /net program /etc/auto.net
ENV: SELINUX_INIT=YES
CONSOLE=/dev/console
TERM=linux
INIT_VERSION=sysvinit-2.85
PATH=/sbin:/usr/sbin:/bin:/usr/bin
LC_MESSAGES=en_US
RUNLEVEL=3
runlevel=3
PWD=/
LANG=ja_JP.UTF-8
PREVLEVEL=N
previous=N
HOME=/
SHLVL=2
_=/usr/sbin/automount
Individual tasks may be selected in the same manner as always;
"ps -a" alone lists all tasks. (anderson@redhat.com)
- Implemented new "ps -g" option, which lists tasks by thread group,
for selected, or all, tasks. For example, to display the tasks
in the thread group containing task c20ab0b0:
crash> ps -g c20ab0b0
PID: 6425 TASK: f72f50b0 CPU: 0 COMMAND: "firefox-bin"
PID: 6516 TASK: f71bf1b0 CPU: 0 COMMAND: "firefox-bin"
PID: 6518 TASK: d394b930 CPU: 0 COMMAND: "firefox-bin"
PID: 6520 TASK: c20aa030 CPU: 0 COMMAND: "firefox-bin"
PID: 6523 TASK: c20ab0b0 CPU: 0 COMMAND: "firefox-bin"
PID: 6614 TASK: f1f181b0 CPU: 0 COMMAND: "firefox-bin"
The thread group leader will be shown first, with the other threads
indented. Individual tasks may be selected in the same manner as
always; "ps -g" alone lists all thread groups. (anderson@redhat.com)
- Fix for "timer" display; although the timer_list entries for each cpu
are correct, the "TVEC_BASES[cpu]" output was displaying incorrect
addresses for each cpu's tvec_base_t structure. (anderson@redhat.com)
(10/02/06)
4.0-3.4 - Implemented support for x86_64 and ia64 compressed kdump dumpfiles
created by the makedumpfile command, which need to pass their
respective physical address load locations in a kdump-specific
dumpfile sub-header. (oomichi@mxs.nes.nec.co.jp)
- Fix for the "timer" command on 2.6.17 and later kernels. Without this
patch, the command would spew out error messages of the sort:
timer: invalid list entry: 0
timer: ignoring faulty timer list at index 0 of timer array
This was due to the kernel's tvec_bases data structures being moved
out of the per-cpu memory regions, and replaced with just per-cpu
pointers to the data. (anderson@redhat.com)
- Fix for ia64 machines whose kernel's text and static data region 5
segment is not loaded at physical address 64MB; live systems get
the physical load address from /proc/iomem, while kdump dumpfiles
contain the load address in the ELF header. Without this patch,
the crash session would fail during initialization with a "crash:
invalid kernel virtual address: [address] type: xtime" error message.
The physical address may still be forcibly set using the command line
option "--machdep phys_start=[address]" (anderson@redhat.com)
- Fix for an irrelevant "WARNING: invalid vm= option" error message that
would be displayed when using the "--machdep phys_start=[address]"
option on an ia64 machine. (anderson@redhat.com)
- Updated the ppc64 page size determination from always using
getpagesize() on the host machine to symbolically determining
whether 64k page sizes are in use. (sachinp@in.ibm.com)
- Enhancement of the "sig" command to display the lists of both private
and/or shared queued signals, if any. (olivier.daudel@u-paris10.fr)
- Adapted "mount [-n pid|task]" patch, which displays the mounted
filesystems with respect to the namespace of a given pid or task.
(olivier.daudel@u-paris10.fr)
- Fix for running crash without parameters on a live system that does
not have a "/usr/src" directory, which would result in a segmentation
violation. (holzheu@de.ibm.com)
- The /proc/version check against vmlinux "strings" output needed to be
made aware that some other character may be adjacent to the "L" in the
"Linux version..." string. This would lead to erroneous "vmlinux and
/proc/version do not match!" errors during initialization.
(holzheu@de.ibm.com)
- gdb-6.1.patch update for gdb-6.1/sim/ppc/debug.c to compile in SUSE
build environment. (olh@suse.de)
(9/19/06)
4.0-3.3 - Addressed a number of issues associated with CONFIG_SPARSEMEM
kernels and kernels using updated manners for the linkage of
their pglist_data structures, and pointers to their mem_map arrays.
(anderson@redhat.com)
- Implemented "kmem -n" for CONFIG_SPARSEMEM kernels; in addition
to the pgdat- and zone-related data command output, it also
displays a list of the SPARSEMEM mem_sections. Here is an
example from an ia64:
crash> kmem -n
NODE SIZE PGLIST_DATA BOOTMEM_DATA NODE_ZONES
0 2359296 e000000008c00000 a000000100749b70 e000000008c00000
e000000008c02400
e000000008c04800
e000000008c06c00
MEM_MAP START_PADDR START_MAPNR
e0000001040a3f00 0 0
ZONE NAME SIZE MEM_MAP START_PADDR START_MAPNR
0 DMA 262144 e0000001040a3f00 0 0
1 DMA32 0 0 0 0
2 Normal 2097152 e0000001048a3f00 100000000 262144
3 HighMem 0 0 0 0
-------------------------------------------------------------------
NR SECTION CODED_MEM_MAP MEM_MAP PFN
0 e00000010409ff00 e0000001040a3f00 e0000001040a3f00 0
1 e00000010409ff08 e0000001040a3f00 e0000001044a3f00 65536
4 e00000010409ff20 e0000001038a3f00 e0000001048a3f00 262144
5 e00000010409ff28 e0000001038a3f00 e000000104ca3f00 327680
6 e00000010409ff30 e0000001038a3f00 e0000001050a3f00 393216
7 e00000010409ff38 e0000001038a3f00 e0000001054a3f00 458752
8 e00000010409ff40 e0000001038a3f00 e0000001058a3f00 524288
9 e00000010409ff48 e0000001038a3f00 e000000105ca3f00 589824
10 e00000010409ff50 e0000001038a3f00 e0000001060a3f00 655360
11 e00000010409ff58 e0000001038a3f00 e0000001064a3f00 720896
12 e00000010409ff60 e0000001038a3f00 e0000001068a3f00 786432
13 e00000010409ff68 e0000001038a3f00 e000000106ca3f00 851968
14 e00000010409ff70 e0000001038a3f00 e0000001070a3f00 917504
15 e00000010409ff78 e0000001038a3f00 e0000001074a3f00 983040
16 e00000010409ff80 e0000001038a3f00 e0000001078a3f00 1048576
17 e00000010409ff88 e0000001038a3f00 e000000107ca3f00 1114112
18 e00000010409ff90 e0000001038a3f00 e0000001080a3f00 1179648
19 e00000010409ff98 e0000001038a3f00 e0000001084a3f00 1245184
20 e00000010409ffa0 e0000001038a3f00 e0000001088a3f00 1310720
21 e00000010409ffa8 e0000001038a3f00 e000000108ca3f00 1376256
22 e00000010409ffb0 e0000001038a3f00 e0000001090a3f00 1441792
23 e00000010409ffb8 e0000001038a3f00 e0000001094a3f00 1507328
34 e0000001040a0010 e0000001010a3f00 e0000001098a3f00 2228224
35 e0000001040a0018 e0000001010a3f00 e000000109ca3f00 2293760
crash>
(anderson@redhat.com)
- Fix for "kmem -i" failure in CONFIG_SPARSEMEM kernels that would
typically fail with the error message: "kmem: invalid kernel virtual
address: 0 type: node_zones free_pages". (anderson@redhat.com)
- Fix for "kmem -f" failure in CONFIG_SPARSEMEM kernels that would
typically fail with the error message: "kmem: invalid kernel virtual
address: ab8 type: node_zones name". (anderson@redhat.com)
- Fix for "kmem -f" failure in 2.6.17 kernels (possibly earlier) that
would fail with the error message: "kmem: invalid structure member
offset: zone_zone_mem_map". (anderson@redhat.com)
- Fix for "kmem [address]" failure in 2.6.17 kernels (possibly earlier)
that would fail with the error message: "kmem: invalid structure
member offset: zone_zone_mem_map". (anderson@redhat.com)
- Fix for "kmem -i" that resulted in a bogus "CACHED" page count
value. (anderson@redhat.com)
- As a result of the last "kmem -i" fix, I've added a new "kmem -V"
option that dumps the kernel's new vm_stat[] array contents by
their enum values:
crash> kmem -V
NR_ANON_PAGES: 38656
NR_FILE_MAPPED: 3116
NR_FILE_PAGES: 141106
NR_SLAB: 58605
NR_PAGETABLE: 1059
NR_FILE_DIRTY: 7
NR_WRITEBACK: 0
NR_UNSTABLE_NFS: 0
NR_BOUNCE: 0
NUMA_HIT: 86475467
NUMA_MISS: 0
NUMA_FOREIGN: 0
NUMA_INTERLEAVE_HIT: 31523
NUMA_LOCAL: 86475467
NUMA_OTHER: 0
crash>
Internally, a new dump_vm_stat() function has been added to access
any of the items in the list. (anderson@redhat.com)
- Implemented support for relocatable x86_64 live kernels and kdump
generated vmcores. Without this patch, attempts to analyze those
kernels would fail during initialization with the error message:
"crash: vmlinux and vmcore do not match!" (anderson@redhat.com)
- Support for recognizing real-time signals in the "sig" command.
(olivier.daudel@u-paris10.fr)
- Fix for "sys -c" display of "sys_ni_syscall" entries that showed
different system call names that have the same (W) symbol value
as the (T) symbol "sys_ni_syscall". For example:
crash> sym -l | grep ffffffff802a38b6
ffffffff802a38b6 (W) compat_sys_ipc
ffffffff802a38b6 (W) compat_sys_keyctl
ffffffff802a38b6 (W) compat_sys_sysctl
ffffffff802a38b6 (W) ppc_rtas
ffffffff802a38b6 (T) sys_ni_syscall
ffffffff802a38b6 (W) sys_pciconfig_iobase
ffffffff802a38b6 (W) sys_pciconfig_read
ffffffff802a38b6 (W) sys_pciconfig_write
ffffffff802a38b6 (W) sys_spu_create
ffffffff802a38b6 (W) sys_spu_run
ffffffff802a38b6 (W) sys_vm86
ffffffff802a38b6 (W) sys_vm86old
crash>
Depending upon the kernel, one of those symbols would be displayed
instead of sys_ni_syscall. (olivier.daudel@u-paris10.fr)
- Fix for "sig" command where in later 2.6 kernels, the queued signal
list at the end of the display would loop back on itself, repeatedly
displaying the same queued signal(s). (olivier.daudel@u-paris10.fr)
(09/07/06)
4.0-3.2 - Enabled CONFIG_SPARSEMEM support for ia64 kernels; tested on
RHEL5-alpha (2.6.17-1.2519.4.5.el5). Without this fix, crash
would fail during initialization with an error message indicating:
"crash: CONFIG_SPARSEMEM kernels not supported for this architecture"
(anderson@redhat.com, dwilder@us.ibm.com)
- Moved read_in_kernel_config() to just after the internal gdb
module gets initialized. Without this fix, Xen kernels built
with CONFIG_IKCONFIG would fail during initialization indicating:
"crash: gdb_interface: gdb not initialized?"
(anderson@redhat.com, moriwaka@valinux.co.jp)
- Implemented new s390/s390x "s390dbf" command to print out
kernel traces from the s390 debug feature (s390dbf). The debug
feature is an s390 kernel trace API which uses wraparound buffers
to store trace records in memory. Many of the s390 device drivers
use this feature. There is some documentation of the s390dbf in
the kernel sources under /Documentation/s390/s390dbf.txt.
(holzheu@de.ibm.com)
- RHEL5-alpha kernel modules (only x86_64 confirmed) may possibly
fail to be loaded with the "mod" command due to dwarf2 errors
associated with the split module.ko/module.ko.debug debuginfo
facility used by RHEL kernels. Bugzillas have been filed to
address those problems, but the crash utility's error-reporting
mechanism has been modified to properly reflect that the internal
gdb module has failed to load the kernel module's debug data.
Without this fix, the "mod -[sS]" commands would silently return
without loading the module data because the "add-symbol-file"
operation inside the gdb module failed, did a longjmp(), and ended
up back at the crash prompt. That behaviour has been changed
to report the module name and the gdb error like so:
crash> mod -S
mod: /lib/modules/2.6.17-1.2564.1/kernel/drivers/scsi/scsi_mod.ko
gdb add-symbol-file command failed
crash>
Note that this problem occurs in all post-RHEL4 kernels, i.e.,
FC4, FC5, and now FC6 and RHEL5.
(anderson@redhat.com)
- Fix for runaway unkillable "repeat" command output that can happen
when scrolling is turned off and the command that was entered is
bogus. (anderson@redhat.com)
- Fix for "struct structure.member address" output when the member
is an array; additional members beyond the array contents would
get displayed. (anderson@redhat.com)
- Fix to internal gdb module to properly handle relocatable kernel
virtual addresses; this will be required for upcoming relocatable
RHEL5 kernels required for the kexec/kdump facility.
(anderson@redhat.com)
- Combined kernel_init(PRE_GDB) and kernel_init(POST_GDB) into a
single call to kernel_init() that is done after gdb is initialized;
verify_version() now called by kernel_init(). This is just a code
re-work, and does not change any functionality. (anderson@redhat.com)
(8/23/06)
4.0-3.1 - Fix to address 2.6.18 and later Fedora 2.6.17-based kernel data
structure name change from "runqueue" to "rq". This would cause
crash to fail during initialization with a "crash: cannot determine
idle task addresses from init_tasks[] or runqueues[]" message,
followed by a red herring message: "crash: cannot resolve
init_task_union". (haren@us.ibm.com)
- Added 4-level pagetable support for ia64. Since this is based
upon whether the kernel was built with CONFIG_PGTABLE_4, the
determination of whether the crash utility uses 4-level page
tables is based upon one of two possibilities: the "automatic"
manner depends upon the kernel also being configured with
CONFIG_IKCONFIG; otherwise it will require the command line
option "--machdep vm=4l". (troy.heber@hp.com)
- Leveraging Troy Heber's addition of code to dig out and uncompress
in-kernel CONFIG_IKCONFIG data, a new "sys config" command option has
been added, which dumps all of the kernel configuration data.
(anderson@redhat.com, troy.heber@hp.com)
- Also leveraging the new CONFIG_IKCONFIG data access, the value of HZ
can now be absolutely determined by reading CONFIG_HZ. If the config
data is not available, then the current use of the HZ #define will
be replaced by the use of sysconf(_SC_CLK_TCK) to account for the
upcoming removal of HZ from glibc header files.
(anderson@redhat.com, olh@suse.de)
- Added a new "--cpus [number]" command line option to work around any
situations where the number of cpus cannot be correctly determined.
This is unlikely to ever be needed, but it was necessary for an ia64
kexec/kdump development kernel issue that has been addressed.
However, it's been left in place as a workaround in case the same
thing occurs due to some other circumstance. (anderson@redhat.com)
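The HZ determination described in this release (CONFIG_HZ first, then
sysconf(_SC_CLK_TCK) as the fallback) could be sketched roughly as
follows. This is only an illustration under stated assumptions: the
helper names hz_from_config() and determine_hz() are hypothetical and
are not taken from the crash sources, and the config text is assumed
to be already uncompressed.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: scan uncompressed kernel config text for a
 * "CONFIG_HZ=<n>" line and return n, or -1 if no such line exists.
 */
long hz_from_config(const char *config_text)
{
    static const char key[] = "CONFIG_HZ=";
    const char *p = config_text;

    while ((p = strstr(p, key)) != NULL) {
        /* Only accept the key when it begins a line. */
        if (p == config_text || p[-1] == '\n') {
            long val = strtol(p + strlen(key), NULL, 10);
            return val > 0 ? val : -1;
        }
        p += strlen(key);
    }
    return -1;
}

/*
 * Hypothetical helper: prefer CONFIG_HZ when config data is available,
 * otherwise fall back to the host's sysconf(_SC_CLK_TCK) value.
 */
long determine_hz(const char *config_text)
{
    long hz = config_text ? hz_from_config(config_text) : -1;

    if (hz > 0)
        return hz;
    return sysconf(_SC_CLK_TCK);
}
```

Note that lines such as "CONFIG_HZ_250=y" are deliberately not matched,
since only the "CONFIG_HZ=" key carries the numeric value.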
(8/04/06)
4.0-2.33 - Fix for possible compilation error in x86_xen_kdump_load_page_PAE()
function in 4.0-2.32 version of x86.c. (anderson@redhat.com)
(7/13/06)
4.0-2.32 - Implemented and tested code to create the Xen kdump p2m table from
the mfn value found in the "pfn_to_mfn_frame_list_list" member
contained within the shared per-domain "arch_shared_info" structure,
which is contained within the architecture-neutral "shared_info"
structure. However, the use of this capability will require that:
(1) the Xen kdump implementation pass this mfn value in the vmcore
ELF header, and
(2) the crash utility will need additional updating to access this
value from the vmcore ELF header.
The current test version of the Xen kdump code passes the dom0 cr3
value in the ELF header, but that only works for Xen kernels with
writable pagetables. Using the pfn_to_mfn_frame_list_list mfn will
work for both writable- and shared-pagetable Xen kernels.
(anderson@redhat.com)
- Support for kernels with no vmalloc addresses, i.e., with an empty
"vmlist", fixing an initialization-time session failure indicating:
"crash: invalid kernel virtual address: 0 type: first vmlist addr"
(moriwaka@valinux.co.jp)
- Fix that allows the "wr" command to accept a 64-bit value.
(castor.fu@3pardata.com)
- Fix for "vtop" on user/kernel virtual addresses that showed the page
offset value on the "PAGE:" output line on x86 PAE kernels.
(anderson@redhat.com)
- Added "rd -x" option to avoid the display of the ASCII translation at
the end of each line. (anderson@redhat.com)
- Fix for unnecessary double-printing of the "mount" command header
when a directory argument is referenced by two different vfsmounts.
(harihare@vnet.ibm.com, shenlinf@cn.ibm.com)
- Fix to recognize equivalent directory arguments to the "mount"
command, i.e., "/boot" is the same as "/boot/".
(shenlinf@cn.ibm.com)
- Fix for "swap" command that dropped "/dev" from swap device pathnames
in 2.6 kernels. (shenlinf@cn.ibm.com)
- Fix for potential segmentation violation when running "bt -f" command
on s390 and s390x. (holzheu@de.ibm.com)
- Added a "rd -m machine-address" option to read Xen machine
addresses if they are accessible; also a general cleanup of the
m2p functionality. (anderson@redhat.com)
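The "mount" directory-argument equivalence noted in this release
("/boot" being the same as "/boot/") amounts to comparing pathnames
with trailing slashes ignored. A minimal sketch, assuming hypothetical
helper names that do not come from the crash sources:

```c
#include <string.h>

/*
 * Hypothetical helper: length of path with trailing '/' characters
 * ignored; the root directory "/" keeps a length of 1.
 */
size_t trimmed_len(const char *path)
{
    size_t len = strlen(path);

    while (len > 1 && path[len - 1] == '/')
        len--;
    return len;
}

/*
 * Hypothetical helper: return 1 if the two directory arguments name
 * the same path once trailing slashes are ignored, otherwise 0.
 */
int same_directory_arg(const char *a, const char *b)
{
    size_t la = trimmed_len(a);
    size_t lb = trimmed_len(b);

    return la == lb && strncmp(a, b, la) == 0;
}
```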
(7/12/06)
4.0-2.31 - Bumped crash-internal NR_CPUS for x86 and ia64; added a warning
message to "recompile crash" and forced an initialization failure
when the kernel's configured NR_CPUS is greater than the maximum
allowed NR_CPUS value compiled into crash.
(maneesh@in.ibm.com, anderson@redhat.com)
- Fix for initialization failure indicating a kernel/memory-source
mismatch when an x86 kernel configures its physical memory start
address higher than the traditional 1MB starting point.
(anderson@redhat.com)
- Fix for kernels that have replaced the "system_utsname" data
structure with contents of the "init_uts_ns" data structure.
This fixes a "crash: cannot resolve system_utsname" initialization
failure. (pbadari@us.ibm.com, anderson@redhat.com)
- Fix for large LKCD dumpfiles that resulted in an initialization
time failure indicating "fixme, need to add more zones (ZONE_ALLOC)".
When the statically-defined ZONE_ALLOC value is too small, the fix
expands the zone size dynamically. (indou.takao@jp.fujitsu.com)
- Fix for "kmem -i" failure when the "all_bdevs" block_device list
is empty. Part of the command output would be displayed, followed by
"kmem: invalid kernel virtual address: 0 type: inode buffer".
(anderson@redhat.com)
- First pass at supporting a Xen hypervisor kexec/kdump vmcore as the
dumpfile format for the dom0 vmlinux. Developed/tested OK on an x86
vmlinux/vmcore set supplied by horms@verge.net.au. Code for x86_64
is in place, but untested. (anderson@redhat.com)
- Also in place, but untested, is initial support for Xen x86 PAE
kernels. (anderson@redhat.com)
(6/27/06)
4.0-2.30 - RHEL4-U4 version. RHEL3-U8 will be (identical) version 4.0-2.29.
- Fix for x86_64-only "vm -p" failure due to "pml page" read error
on kernels with 3-level user page tables; regression was introduced
by x86_64 Xen support in 4.0-2.24. (anderson@redhat.com)
- Fedora, and future RHEL, build procedure requires the removal of
the inclusion of certain kernel header files; removed inclusion(s)
of page.h, list.h, and segment.h. (anderson@redhat.com)
(6/06/06)
4.0-2.24 - Fix for 2.6.17 kernels that do not use "pgdat_list" memory node
list header, which would cause crash to fail during initialization
with a "crash: cannot resolve: pgdat_list" error message.
(anderson@redhat.com)
- Fix for 2.6.17 kernels that have re-worked the kernel pid_hash
handling, which would cause crash to fail during initialization
with a "crash: cannot determine pid_hash array dimensions" error
message. (anderson@redhat.com)
- If the vmlinux file and /proc/version do not match, and crash tries
to find an appropriate System.map file to use for symbol addresses,
a new "WARNING: vmlinux and /proc/version do not match!" message
will be displayed. Note that the System.map file that crash finds
will be appropriate for data symbols, but may not necessarily be
correct for text regions. When this happens, kernel text disassembly
may be incorrect, and this in turn leads to other problems, such as
incorrect back-tracing. (anderson@redhat.com)
- Fix for recent 2.6 kernels "sys" UPTIME display and for "ps -t" RUN
TIME displays due to change to HZ value. (anderson@redhat.com)
- Continued Xen support: this version runs on live x86_64 xen0 and xenU
kernels, on x86_64 xenU core dumps, and x86_64 xenU "xm save" files.
As is the case for x86 Xen, this support is only for x86_64 kernels
with writable page tables. (anderson@redhat.com)
- Fix for x86_64 IS_LAST_PML4_READ() macro, which (harmlessly) never
worked, but caused the PML4 page to be re-read each time. Added a
per-arch clear_machdep_cache() function for processors needing to do
their own virtual-to-physical page table cache clearing; so far only
ppc64 and x86_64 need it for the top-most of their 4-level page table
pages. (anderson@redhat.com)
(5/01/06)
4.0-2.23 - Fix for "kmem -[sS]" command in 2.6.15 kernels which introduced
per-NUMA node slab chains. Without this patch the command fails
with a "kmem: invalid structure member offset: kmem_cache_s_lists"
error message. (sharyath@in.ibm.com)
- Fix for an initialization error on 2.6.16 kernels indicating:
"crash: cannot determine idle task addresses from init_tasks[]
or runqueues[]" followed by "crash: cannot resolve init_task_union"
error messages. This was due to the introduction of a runqueue.cpu
member that conflicted with an old cpu member in RHEL3-specific O(1)
scheduler code. (anderson@redhat.com)
- Fix for "kmem -i" in newer 2.6 kernels where the new ZONE_DMA32 bumps
up the value of ZONE_HIGHMEM, causing a potential segmentation
violation. (anderson@redhat.com)
- Fix for "kmem -i", where the PG_slab bit determination has been fixed
so that the correct number of slab pages is displayed.
(anderson@redhat.com)
- Fix for "swap" command and "kmem -i" option on 64-bit 2.6.15 kernels
which could fail with a crash internal buffer dump followed by these
messages: "swap: cannot allocate any more memory!" or "kmem: cannot
allocate any more memory!". This was due to the swap_info_struct.max
member being downsized from a long to an int. (anderson@redhat.com)
- Continued Xen support: this version runs on live x86 xen0 and xenU
kernels, on xenU core dumps, and on xenU "xm save" files. This
support is for x86 kernels with writable page tables only. Minimal
support for running on live x86_64 xen0 kernels with writable page
tables is also in place, but does not allow access to user virtual
memory as of yet. (anderson@redhat.com)
(4/12/06)
4.0-2.22 - Incorporated initial patch-set to implement support for kernels built
with CONFIG_SPARSEMEM. (dwilder@us.ibm.com)
- Fix for post-2.6.15 ppc64 kernels to use cpu_online_map when perusing
the paca array for the per_cpu_offsets. (haren@us.ibm.com)
- Fix for ppc64 "bt" command for active tasks that were running in
user space at the time of crash. (haren@us.ibm.com)
- Fix to remove dependencies upon any kernel header files so as to
allow crash to build in an Ubuntu environment. (aquynh@gmail.com)
- Fix size of x86_64 "cpu_khz" variable to match that of the kernel.
(sharyath@in.ibm.com)
- Created framework for support of Xen kernel dumpfiles and live Xen
kernels; this is going to be a long-period work-in-progress affair,
and the code added in this release is being done now primarily to aid
in future patch integration efforts. (anderson@redhat.com)
(3/23/06)
4.0-2.21 - Fix to recognize post-2.6.15 ppc64 kernels moving the per_cpu_offsets
to the "paca" structure. Without this patch, crash fails with the
following error messages: "crash: cannot determine idle task addresses
from init_tasks[] or runqueues[]" and "crash: cannot resolve
init_task_union". (pbadari@us.ibm.com)
- Incorporated a patch containing ppc64 specific changes when reading
kdump vmcores. Kdump vmcores contain pt_regs for all cpus in the ELF
header, so they are read from there rather than from the active tasks'
kernel stacks; also, the register contents are printed before any
active task backtrace. (haren@us.ibm.com)
- If pglist_data.node_mem_map structure member does not exist, as in a
ppc64 kernel built with CONFIG_SPARSEMEM, print an init-time warning
message instead of failing with "crash: invalid structure member
offset: pglist_data.node_mem_map" message. (haren@us.ibm.com,
anderson@redhat.com)
(2/16/06)
4.0-2.20 - Fix to recognize 2.6.16 change that removed the x86_64 cpu_pda[]
array of x8664_pda structures and replaced it with a _cpu_pda[]
array of pointers to those structures. Without the patch, crash
failed during initialization of 2.6.16 x86_64 kernels with a
"crash: cannot resolve cpu_pda" error. (rachita@in.ibm.com)
- Added a minor enhancement to the "list" command to allow the
"start" argument to also be an (expression) that evaluates to the
address of the starting list_head; previously it only allowed
a symbol or a virtual address. (anderson@redhat.com)
(2/03/06)
4.0-2.19 - Fix for the "bt" command on ia64 kernels with 64K page size.
(1/11/06)
4.0-2.18 - Fix for the "files" command for 2.6.14 and later kernels, in which
the files_struct data structure contains the new fdtable data
structure. (rachita@in.ibm.com)
- Fix for an "invalid lvalue in assignment" compile-time error
generated from gdb-6.1/bfd/coff-alpha.c that prevents the embedded
gdb from building with newer compilers. (troy.heber@hp.com)
(1/5/06)
4.0-2.17 - Fix to resurrect LKCD version 8 support, inadvertently broken in
4.0-2.15. (troy.heber@hp.com)
- Fix for "net -S" failures in certain 2.6 kernels that failed with
"net: cannot determine what an inet_sock structure is" message;
shows embedded sock structure instead of failing. (anonymous donor)
- Fix for erroneous "net -s" source/destination address and port
values in certain 2.6 kernels; added "net -s" source/destination
address and port values for IPv6 sockets. (anderson@redhat.com)
(12/16/05)
4.0-2.16 - Fix for the x86_64 backtrace code to search all of the exception
stacks for the origin of the active tasks' backtrace when the
information is not available in the dumpfile header. Up until now,
the search was made in the process stack, the per-cpu IRQ stack,
and the per-cpu NMI exception stack; this patch looks at all 3
exception stacks in 2.4 kernels (NMI, STACKFAULT and DOUBLEFAULT),
and all 5 exception stacks in 2.6 kernels (NMI, STACKFAULT,
DOUBLEFAULT, DEBUG and MCE).
- Fix to remove erroneous warning message re: the task cpu not being
the same as the IRQ or exception stack cpu, which was displayed when
doing a non-context-sensitive "bt -E" on an x86_64.
(12/12/05)
4.0-2.15 - Applied Kurt Rader's (kdrader@us.ibm.com) patch for SUSE SLES 9
"bigsmp" kernel LKCD dumpfiles, to fix "conflicting page" abort
caused by a dumpfile header that is larger than the formerly
hard-wired header size.
- Fix for ppc64-only segmentation violation when running "bt" on the
panic task when run against a dumpfile created by the diskdump
facility's new compressed format.
(12/02/05)
4.0-2.13 - Adapted Takao Indoh of Fujitsu's patch for determining proper size
of the ia64_init_stack; fixes empty ia64 "bt -a" output for cpu 8 and
above for diskdumps generated via OS_INIT.
- Applied a patch to address a "net -s" error due to the inet_opt
structure being dropped between 2.6.10 and 2.6.11, which led to a
"net: invalid structure member offset: inet_opt_daddr" failure.
- Made the initialization-time rule such that if "bt -O" is contained
in any or all of the 3 possible initialization-time input files
($HOME/.crashrc, ./.crashrc, or "-i inputfile" files), the setting
will remain idempotent. Fixed the redundant running of $HOME/.crashrc
and ./.crashrc files if they are the same file.
- Added a gdb work-around/hack for ia64 initialization-time warning
"WARNING: cannot determine unw.tables offset" on rebuilt RHEL3 ia64
kernels that would prevent "bt" from working.
- Backed out 4.0-2.11 x86_64 pseudo-backtrace patch to show in-kernel
exception frame RIP and RSP values as a unique frame following the
register dump; instead, the exception RIP address is translated
and displayed prior to the register dump.
(11/23/05)
4.0-2.12 - Update to diskdump page_desc struct, required for ongoing support
of the diskdump facility's compression feature, currently under
development.
- Applied patch from Ken'ichi Ohmichi of NEC to prevent a segmentation
violation during a "bt -f" on an x86_64 task that had taken an NMI
during cpu_idle().
- Adapted Badari Pulavarty's patch for recognition of recent 2.6.14
kernel structure/member name changes: mm_struct._rss to _file_rss,
and the kmem_cache_s structure's renaming to kmem_cache. Without
the patch, crash sessions would fail during initialization with a
"crash: invalid structure member offset: kmem_cache_s_num" error,
and the "ps" command would fail with a "ps: invalid structure member
offset: mm_struct_rss" error.
(11/15/05)
4.0-2.11 - Adapted a number of proposed patches:
- Badari Pulavarty of IBM's implementation of support for 2.6.14
ppc64 kernel's use of 4-level page tables.
- Added a new "extensions" sub-directory for collecting crash
command extension libraries; initially populated with the sample
"echo.c" from the extend help page, along with a device-mapper
related "dminfo.c" module from NEC.
- Castor Fu of 3PAR's implementation of support for LKCD version 10,
as well as the handling of single-bit errors in LKCD compressed
pages by trying out all possible single-bit errors. Also his
fixes for better recognizing -fomit-frame-pointer kernel builds,
a stronger defense against potential bogus processor numbers
associated with tasks in dumpfiles, and a fix to re-allow crash
builds for gcc 2.x compilers.
- Fix for potential "vmcore: initialization failed" fatal error during
initialization when using more than just vmlinux and vmcore command
line arguments.
- Fix for diskdump.c compile failures using gcc 2.96.
- Update to the x86_64 pseudo-backtrace code to show as a frame the
RSP, RIP and name of the function causing a kernel-mode exception
frame.
- Fix for the x86_64 pseudo-backtrace code to not neglect to show the
user-mode exception frame when that task subsequently took a
kernel-mode exception.
- Exported the load_extension() and unload_extension() functions so
that they can be called from an extension library.
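The single-bit error handling for LKCD compressed pages mentioned in
this release works by trying every possible single-bit change. The
idea could be sketched as below. This is an illustration only: the
function names are hypothetical, and the validity check is a toy byte
sum standing in for whatever test the real code applies (presumably a
successful decompression of the page).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical validity callback: returns nonzero if buf looks good. */
typedef int (*page_ok_fn)(const uint8_t *buf, size_t len, void *ctx);

/*
 * Flip each bit of buf in turn and ask the callback whether the result
 * is valid. Intended to be called after the unmodified buffer has
 * already failed validation. Returns the index of the repaired bit,
 * leaving that flip in place, or -1 with buf unchanged if no single
 * flip helps.
 */
long repair_single_bit(uint8_t *buf, size_t len, page_ok_fn ok, void *ctx)
{
    size_t byte;
    int bit;

    for (byte = 0; byte < len; byte++) {
        for (bit = 0; bit < 8; bit++) {
            buf[byte] ^= (uint8_t)(1u << bit);      /* try this flip */
            if (ok(buf, len, ctx))
                return (long)(byte * 8 + bit);
            buf[byte] ^= (uint8_t)(1u << bit);      /* undo it */
        }
    }
    return -1;
}

/* Toy validator (an assumption): byte sum must equal *(unsigned *)ctx. */
int sum_matches(const uint8_t *buf, size_t len, void *ctx)
{
    unsigned expected = *(const unsigned *)ctx;
    unsigned sum = 0;
    size_t i;

    for (i = 0; i < len; i++)
        sum += buf[i];
    return sum == expected;
}
```

With a weak check such as the toy byte sum, more than one flip can
satisfy the validator, so the first match is not guaranteed to be the
bit that was originally damaged.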
(11/10/05)
4.0-2.10 - Adapted a patch set created by Badari Pulavarty of IBM, that
addresses a fatal initialization-time crash error, which displays
"crash: invalid structure member offset: x8664_pda_level4_pgt"
when run against post-2.6.10 x86_64 kernels. But more importantly,
Badari's patch adds support for these x86_64 kernel changes that
were introduced in 2.6.11:
- x86_64 kernel virtual address range changes, and
- x86_64 user virtual address space usage of 4-level page tables
(11/07/05)
4.0-2.9 - Adapted a patch set from NEC and Fujitsu that introduces support
for an alternative compressed dumpfile format created by the
diskdump facility. When the diskdump facility is configured to
use compression, the dumpfile will not be an ELF vmcore file,
but rather a compressed dumpfile image, derived from the LKCD
dumpfile format.
(11/03/05)
4.0-2.8 - Adapted a patch sent by Jun'ichi Nomura of NEC that addresses
a problem with the "mod" command, such that when trying to load
the debug data from a module whose kernel name is different than
its module object filename, it would require a manual "mod -s"
command line containing the full pathname to the module's object
file. This typically happens when a module's name string contains
an underscore, while its object file contains a dash. Jun'ichi's
patch simply retries any unsuccessful module object file searches
after replacing the underscore with a dash.
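The retry described above could be sketched as a small name
transformation applied before the second search. This is a minimal
illustration, not the patch itself: the helper name
underscores_to_dashes() is hypothetical, and the sketch assumes every
underscore in the name is replaced.

```c
#include <string.h>

/*
 * Hypothetical helper: copy a kernel module name into buf, replacing
 * each underscore with a dash, so that a failed object file search can
 * be retried with the alternate spelling. Returns 1 if any character
 * was replaced, or 0 if nothing changed or buf is too small.
 */
int underscores_to_dashes(const char *name, char *buf, size_t buflen)
{
    size_t i, len = strlen(name);
    int changed = 0;

    if (len + 1 > buflen)
        return 0;

    for (i = 0; i < len; i++) {
        if (name[i] == '_') {
            buf[i] = '-';
            changed = 1;
        } else {
            buf[i] = name[i];
        }
    }
    buf[len] = '\0';
    return changed;
}
```

A caller would only retry the search when the function returns 1, since
a return of 0 means the alternate name is no different from the one
that already failed.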
(10/21/05)
4.0-2.7 - Fixed x86_64 backtrace code to recognize 32-bit user code kernel
entry exception frames (code segment selectors of 0x23) without
issuing a "bt: WARNING: possibly bogus exception frame" message.
- Fixed x86_64 backtrace code to recognize in-kernel exception
frames generated from module text in situations where the module
data was not included in the dumpfile, such as in a netdump which
resulted in a vmcore-incomplete file.
(10/19/05)
4.0-2.6 - Backed out support for the proposed NT_KDUMPINFO ELF notes section
in kexec/kdump vmcores (which has been rejected upstream for now).
- Fix for faulty backtrace display of exception frames coming out of
either "nmi" or the generic "error_code" fault handlers, as seen in
later 2.6 kernels.
- Restored "dev -i" and "dev -p" options for x86, which I mistakenly
removed when the s390[x] support was added in 3.10-13.2.
(9/29/05)
4.0-2.5 - Continued support for kexec/kdump generated vmcore files, with
this release running against SMP i386 dumpfiles (32-bit ELF),
which contain multiple, per-cpu, NT_PRSTATUS sections, and also
containing support for the proposed NT_KDUMPINFO ELF notes section.
- Implemented new "bt -T" option to supplement "bt -t", the difference
being that the -T option dumps all text addresses in a process stack
starting just above the task_struct or thread_info structure,
whichever applies; whereas "bt -t" starts where it determines is
the lowest depth that the stack had reached during the task's last
entry into the kernel.
- Fix for "bt -r" output on 2.6.13 kernels where certain addresses
were recognized as kernel addresses, but could not be translated
symbolically (also affected "rd -s" output).
(9/20/05)
4.0-2.4 - Initial support for kexec/kdump generated vmcore files. So far
testing has only been done on uniprocessor i386 32-bit and 64-bit
ELF header dumpfiles; expect ongoing kdump support updates.
- Fix for "ps: invalid structure member offset: mm_struct_rss"
command failures on 2.6.13 kernels.
(9/09/05)
4.0-2.3 - Update to recognize the contents of the 2.6 kernel's shared
array_cache in each kmem_cache_s header:
- kmem -S will show an individual object in a slab cache's shared
cache as a free object tagged with "(shared cache)", instead
of indicating that the object is allocated.
- kmem -[sS] statistics will count the shared objects as free
instead of allocated.
- kmem -[sS] error checking has been updated to recognize when
an object is erroneously on more than one of the three possible
list types, i.e., a slab's free list, any of the cache's per-cpu
lists, and the cache's shared list.
(8/30/05)
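The list-type check described in 4.0-2.3 can be sketched as follows.
This is only an illustration in Python, not crash's actual C code; the
function name and the way the three lists are passed in are hypothetical.

```python
from collections import defaultdict


def find_multiply_listed(slab_free, per_cpu_lists, shared):
    """Return {object address: [list types]} for objects found on more
    than one of the three list types (hypothetical helper, not crash code).

    slab_free     -- addresses on a slab's free list
    per_cpu_lists -- one list of addresses per cpu
    shared        -- addresses on the cache's shared list
    """
    seen = defaultdict(set)
    for addr in slab_free:
        seen[addr].add("slab free list")
    for cpu_list in per_cpu_lists:
        for addr in cpu_list:
            seen[addr].add("per-cpu list")
    for addr in shared:
        seen[addr].add("shared list")
    # An object is only suspicious if it shows up under two or more types.
    return {addr: sorted(kinds) for addr, kinds in seen.items() if len(kinds) > 1}
```

An object that appears on two different per-cpu lists is not flagged by
this sketch, since both lists are of the same type.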
4.0-2.2 - Fixes inadvertent breakage of the kmem_cache initialization
code on 2.4 UP kernels, which would indicate "crash: unable to
initialize kmem slab cache subsystem" during crash session init.
The bug was introduced in 4.0-2.1, and would disallow subsequent
"kmem -s" operations.
(8/11/05)
4.0-2.1 - This update consists of a set of SUSE-kernel related changes.
Based upon a suggestion from Kurtis Rader of IBM, we made the
initialization of the kmem_cache slab subsystem able to more
gracefully handle:
- missing pages in dumpfiles which could cause the crash
session to bail out prematurely with a "seek" error on
an "array cache limit" access.
- x86_64 dumpfiles from kernels that have NR_CPUS set to
greater than 32, which would cause a segmentation violation.
- Kurtis Rader also sent in a patch for kmem-related commands that
recognizes SUSE's replacement of the zone_struct.zone_start_paddr
field with the zone_struct.zone_start_pfn field.
- Kenneth Sumrall of MontaVista Software sent in a patch that makes
the timer command, after coming across a corrupted list, continue
dumping the remaining timer lists.
(8/10/05)
3.10-13.11 - Adapted a patch set forwarded by Bob Bell of EMC to support
LKCD dumpfile version 9 format.
(7/7/05)
3.10-13.10 - Several x86_64 "bt" command fixes that address the following
occasionally-seen bugs:
- Double display of user entry exception frame.
- Skipping of an in-kernel exception frame display.
- Invalid "possibly bogus exception frame" associated with a
valid in-kernel exception frame.
- Potential premature stopping of backtrace frame display due to
a stale "schedule_timeout" return address.
(6/16/05)
3.10-13.9 - Michael Holzheu of IBM forwarded a set of patches that address
the following issues:
- Fix for "ps -t" stime and utime for recent 2.6 kernels.
- Align the output of the "mod" command correctly for both
32-bit and 64-bit systems.
- If a search pattern given to the mount command matches
multiple mount entries, print all of the matching entries. Fix
for a parsing problem when mount -f or -i is specified along
with a search pattern.
- Align the output of the address field of the "sig" command.
- Print an error message if 0 is entered as the argument to bt -S.
- Added recognition of new diskdump ELF note section, and based upon
its contents, indicate whether a diskdump-generated dumpfile is
a partial dump (i.e., with configured-out user pages, page cache
pages and/or zero-filled pages).
(6/08/05)
3.10-13.8 - Fix for possible "cannot determine mount list location!" error
when running "mount" command on a 64-bit system because gdb cannot
find debug data for namespace structure. Update to the "struct"
command to allow a negative argument to the -c option; if a negative
count value is entered, that (positive) count of structures leading
up to and including the target structure will be displayed.
(5/17/05)
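One reading of the negative "struct -c" count described in 3.10-13.8,
sketched in Python. The helper name is hypothetical, and the assumption
that the structures are laid out contiguously (as in an array) is mine,
not the entry's.

```python
def struct_addresses(target, size, count):
    """Addresses that would be displayed for a given -c count.

    Hypothetical helper: assumes contiguous structures of `size` bytes.
    A positive count starts at `target` and moves forward; a negative
    count shows that many structures leading up to and including `target`.
    """
    if count == 0 or size <= 0:
        raise ValueError("count must be non-zero and size must be positive")
    if count > 0:
        return [target + i * size for i in range(count)]
    n = -count
    return [target - (n - 1 - i) * size for i in range(n)]
```

With a 0x10-byte structure at 0x1000, a count of -3 yields 0xfe0, 0xff0
and 0x1000, in that order.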
3.10-13.7 - Michael Holzheu of IBM forwarded a set of patches that address
the following issues:
- No "bt" is allowed on s390[x] running tasks on a live system.
- The "bt -I [eip]" option is not allowed on s390[x] systems.
- The help page for "bt" indicates that multiple pid and/or task
arguments may be entered.
- Fix for possible segmentation violation if "files -d [dentry]"
is given a random dentry address argument.
- "kmem -v" formatting fix for s390x.
- Fix for division by zero violation when "kmem -i" is run on a
system with no swap.
(5/06/05)
3.10-13.6 - Two fixes for "kmem -s [slab-address]": one where the slab's inuse
count would be reported as incorrect (bug introduced in 3.10-13.5),
and a second fix to display the proper slab statistics. Fixed both
btop and ptob commands to handle 64-bit values on 32-bit systems.
Fix for "net -s" in 2.6 in which gdb cannot properly determine
the contents of an inet_sock structure.
(5/02/05)
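The kind of truncation that the btop/ptob fix in 3.10-13.6 guards against
can be illustrated as below. This is a Python sketch rather than crash's
C code; the 4 KiB page size and the word_bits parameter are assumptions
made for the example.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def ptob(pfn, word_bits=64):
    """Page number to byte address, truncated to word_bits bits."""
    return (pfn << PAGE_SHIFT) & ((1 << word_bits) - 1)


def btop(addr, word_bits=64):
    """Byte address to page number, with addr truncated to word_bits bits."""
    return (addr & ((1 << word_bits) - 1)) >> PAGE_SHIFT


# Page 0x100000 is the first page at the 4 GiB boundary: it needs a
# 64-bit result, and a 32-bit calculation wraps around to 0.
wide = ptob(0x100000)        # 0x100000000
narrow = ptob(0x100000, 32)  # 0
```

Carrying the value in a 64-bit type regardless of the host word size
avoids the wrap-around shown in the 32-bit case.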
3.10-13.5 - Restored gdb-6.1 as the embedded gdb; a fix to gdb's dwarf2read.c
allows proper access to debug data from vmlinux kernels built with
gcc-3.4.*, and the proper loading of module debug data. The
unconditional init-time warning messages re: --readnow have been
removed; they are now shown only if a required structure member or
structure size cannot be determined.
- First pass at handling diskdump-generated 64-bit vmcore files with
multiple PT_LOAD segments.
- Fix for x86_64 kernel exception frame recognition; bt was showing
"error_exit" symbol in trace without accompanying exception frame.
- Additional "slabs_full" contents verification of c_num field in
2.4 kernels; kmem -s would bypass a bogus slab_s with an invalid
inuse field.
(4/26/05)
3.10-13.3 - Fix for case in which a netdump's panic task is dead, had
called do_exit(), which in turn has called schedule(). It is a
kernel bug for the task to be rescheduled and return back to
do_exit() from schedule(), and if it does the kernel does a
BUG() to force an oops. The crash utility never expected to
see this anomaly, and would bail out during initialization
with a "crash: task does not exist: [task address]" message.
- Fix to allow running with 2.6 x86_64 kernels in which CONFIG_NR_CPUS
is 8, on a system with 8 cpus. The system would fail initialization
with a message of the sort: "crash: read error: kernel virtual
address: 20000800403acd23 type: tss_struct ist array".
(4/12/05)
3.10-13.2 - Introduces s390 and s390x support, submitted by Michael Holzheu
of IBM. Both LKCD and s390 standalone dumpfile formats are
supported.
(3/23/05)
3.10-13.1 - Fix for 2.6 kernels with "linux_banner" located in read-only
data section, to fix initialization failure indicated by
a "WARNING: invalid linux_banner pointer: 756e694c" message,
followed by an "invalid kernel virtual address: 756e694c",
and then a "bad match" failure.
- Addressed several type-check warnings generated when compiling
with gcc 4.
- Clean up %build-root after an rpmbuild without the install step.
(3/18/05)
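An observation on the 3.10-13.1 linux_banner entry that is not stated
there: the reported value 756e694c, read as a little-endian 32-bit
integer, is the ASCII text "Linu". That is consistent with the first
characters of the banner string having been read where a pointer was
expected. The Python helper below is hypothetical and only decodes the
value.

```python
def decode_le32(value):
    """Interpret a 32-bit value as four little-endian ASCII characters."""
    return value.to_bytes(4, "little").decode("ascii")


print(decode_le32(0x756e694c))  # prints "Linu"
```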
3.10-11 - Enhanced x86, x86_64 and ia64 module text disassembly output
to symbolically display call targets without requiring module
debuginfo data.
- Fixed hole where an ia64 vmcore could be mistakenly accepted
as a usable dumpfile on an x86_64 host, leading eventually
to a non-related error message.
- Fixed potential "bt -a" hang on dumpfile where diskdump/netdump
IPI interrupted an x86 process on a 4g/4g kernel while executing
the instructions just after it had entered the kernel for a
syscall, but before calling the handler.
- Updated to handle backtraces from x86_64 dumpfiles generated
while running on the NMI exception stack.
- Applied patch from Troy Heber of HP to fix faulty ia64 module
text disassembly output.
(2/21/05)
3.10-3.1 - Fortified "kmem -[sS]" verification of slab chain linkage and
slab structure contents, with reporting of any errors found; this
prevents potential segmentation violations or command hangs
when performing a kmem -[sS] command on a dumpfile with slab
corruption.
- Adapted a patch from Jun'ichi Nomura of NEC that properly displays
backtraces from netdump/diskdump dumpfiles generated from INIT
switches on ia64 machines; the kernel must have per-cpu INIT stacks.
(12/17/04)
3.10-1 - Fix for segmentation violation during initialization seen on a
2.6 x86_64 SMP kernel run on a system running with "maxcpus=1".
Fix for "bt" on the panic task in a 2.6 x86_64 netdump, due to
the user_regs_struct debug data not being gathered -- even with
the retrofitted gdb-6.0 (yet another issue associated with gdb
and kernels built with gcc-3.4.*).
- Fix for the "mod -[sS]" command for ppc64 on kernels built with
gcc-3.4.*. It should be noted that "mod -[sS]" does not work on
ia64 and x86_64 kernels built with gcc-3.4.*, as there is no
version of gdb that can properly handle ia64 and x86_64 kernel
module objects built with compiler versions starting with 3.4.*.
Hopefully that shortcoming will be addressed in a future version
of gdb.
(11/24/04)
3.9-1 - In an interim attempt to deal with the current version of gdb
not being able to properly access debug data from vmlinux
kernels built with gcc-3.4.*, I have reverted the embedded
gdb version from gdb-6.1 back to gdb-6.0. (I will keep version
3.8-5.11 available in old_versions/, which is the last version
with gdb-6.1 embedded.) The gdb team at Red Hat is aware of the
problem, and when a new version of gdb that works is available
either at the FSF site or internal to Red Hat, I will upgrade
again. In any case, with gdb-6.0, the initialization-time
"using an invalid structure member offset" errors should not
occur; if they do, the warning message concerning the use of
the command line "--readnow" option should be applied.
- Other fixes in this release address:
- the occasional failures of the "timer" command with
"zero-size memory allocation!" errors.
- a failure of the "bt" command for the 2.6 migration_thread.
- ia64 /dev/mem read failures when a page straddles an EFI memory
segment.
- the removal of unnecessary ia64 "unwind" warning messages running
"bt" on 2.6 kernels.
(11/18/04)
3.8-5.11 - No functional changes or fixes are in this release. However,
the error reporting mechanism for attempts to use an invalid
structure member offset or invalid structure size has been
beefed up to additionally report the invalid data type item,
along with the function, filename and line number. Up until
now, only a rudimentary back trace has been displayed.
(10/29/04)
3.8-5.10 - Fix for a failure to properly determine the correct number
of cpus in an ia64 netdump dumpfile from an SMP kernel running
on a single-processor system.
(10/26/04)
3.8-5.9 - Fix for potential segmentation violation during "mod -S" due
to an overrun of what was a statically-defined bfd section data
array. Cleaned up bogus error message appearing at the top
of the README file.
- Fixed two uninitialized variable usages.
(10/26/04)
3.8-5.8 - Fix for newer 2.6 ia64 kernels whose in-kernel "unw" structure
contains the new "r0" member, which was causing backtraces to
fail with "kernel and crash unwind data structures are out of sync"
messages.
- Fixed ia64 register dumps in backtraces to properly display the
floating point register contents, as well as the additional
F9 and F10 registers from kernels with newer pt_regs structures
that contain them.
(10/15/04)
3.8-5.7 - Introduced support for ia64 LKCD dumpfiles, as well as several
other LKCD-related fixes, submitted by Troy Heber of Hewlett-Packard.
- Fix for x86 backtrace if an IRQ is received on a CPU that has just
entered the kernel via system_call but has not yet called the system
call handler, running on a 4g/4g kernel.
- Updated ExclusiveArch in crash.spec file to include both ppc64pseries
and ppc64iseries.
(10/14/04)
3.8-5.6 - Continued ppc64 support work for 2.6 diskdump and netdump
facilities handling, submitted by Haren Myneni of IBM.
(10/01/04)
3.8-5.5 - Continued ppc64 support, submitted by Haren Myneni of IBM, to
handle ppc64 netdump-generated dumpfiles, to find and display
the active backtraces from dumpfile info, to handle 2.6 IRQ stacks,
and to fix an endian issue associated with the kmem command.
- Fix for x86_64 to deal with recent 2.6 removal of the init_tss
array, which has been replaced with per-cpu tss_structs; fixes
"cannot resolve init_tss" error during initialization.
(9/29/04)
3.8-5.4 - Implement support for recently-introduced PID hashing scheme in
which pid_hash[] is now an array of hlist_head pointers that
head a list of hlist_node structures; fixes "using an invalid
structure member offset" crash initialization failure.
(9/10/04)
3.8-5.3 - Makes NR_CPUS processor-specific, for the most part based upon
their configurable maximums; fixes segmentation violation during
initialization when crash's NR_CPUS was less than the kernel's
configured value.
- Updated time-related issues to deal with 2.6 change of task_struct's
start_time from an unsigned long to a u64, and kernel HZ difference
from user-space view; still there are problems to address, such as
the system's uptime display, and the ps -t output of the initial
swapper process.
(9/3/04)
3.8-5.2 - Accept "--readnow" crash command line argument, which gets passed
on to the embedded gdb module. This may help alleviate the
gdb-6.1/gcc-3.4.x debug data problem on some architectures.
(8/31/04)
3.8-5.1 - Introduces ppc64 support, submitted by Haren Myneni of IBM.
(8/24/04)
3.8-5 - Snapshot of the tentative target for RHEL4/FC3:
- Fix for possible ia64 "bt -al" segmentation violation on the
idle task ("swapper") running on other than the boot processor.
- Fix for ia64 build issues when compiled in a 2.6 environment.
- Fix for incorrect presumption that a vmlinux file has no
debugging data when compiled with gcc 3.4.x.
- Clean up of lkcd_x86_trace.c's gcc version-specific kludgery.
NOTE: The fix for gdb 6.1's inability to gather debug data from vmlinux
files built with gcc 3.4.x does not work in all cases; the gdb team
is looking into the issue now.
(7/14/04)
3.8-3 - No change -- version 3.8-3 is a Red Hat internal snapshot of 3.8-2.2.
Version 3.8-3 is tentatively targeted for the RHEL3-U3 release,
so this is simply a version-sync.
(6/28/04)
3.8-2.2 - Minor changes:
- Fixes gcc 3.3.3 compiler warning regarding the use of 64-bit
bitmap #define's that didn't have "ULL" appended.
- Presumes backtrace compatibility with gcc 3.3.2 for now.
- Added another Red Hat kernel search directory; the current one for
AS2.1/RHEL3 is: "/usr/src/redhat/BUILD/kernel-2.x.x/linux".
For RHEL4 kernels it has been changed to be of the form:
"/usr/src/redhat/BUILD/kernel-2.x.x/linux-2.x.x". This will
allow a no-argument live crash session to be initiated on a
system hosting the kernel's build environment without having
to install the kernel debuginfo package.
(6/25/04)
3.8-2.1 - Fixes a flurry of bugs related to 2.6 kernel changes:
- The initialization of the kmem_cache subsystem was causing a crash
invocation failure on UP systems; fixed bug, but also added a
"--no_kmem_cache" command line option to skip over kmem_cache
slab subsystem initialization, since there's no need to die if
the slab subsystem undergoes changes.
- Fix to handle new kmem_bufctl_t typedef, which has always been
an int, but is now a processor-dependent int or short. This
resulted in "kmem -S" causing a segmentation violation.
- Fix to deal with structure member name change from page->count
to page->_count. This was causing various kmem command options
to fail.
(6/22/04)
3.8-2 - Introduces support for the diskdump facility on ia64 platforms.
(6/17/04)
3.8-1 - Introduces x86_64 support.
- Also, updates from Corey Minyard to support the 2.6 LKCD dumpfile
format were applied.
(6/02/04)
3.8-0 - First crash release containing FSF gdb-6.1, replacing Red Hat version
gdb-5.3post-0.20021129.36rh.
(5/04/04)
3.7-5.4 - First crash release that supports the 2.6 kernel.
(4/22/04)