Commit message | Author | Age
...
* [PATCH] Print version and command line in debugging message
  (Kazuhito Hagio, 2019-12-10)

  Suggested-by: John Donnelly <john.p.donnelly@oracle.com>
  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
* [PATCH] Makefile: remove -lebl from LIBS when no libebl.a
  (Pingfan Liu, 2019-12-09)

  Since the following commit, -lebl has been removed from elfutils
  (elfutils-0.178 or later contains the commit):

    commit b833c731359af12af9f16bcb621b3cdc170eafbc
    Author: Mark Wielaard <mark@klomp.org>
    Date:   Thu Aug 29 23:34:11 2019 +0200

        libebl: Don't install libebl.a, libebl.h and remove backends from spec.

        All archive members from libebl.a are now in libdw.a. We don't
        generate separate backend shared libraries anymore. So remove them
        from the elfutils.spec file.

        Signed-off-by: Mark Wielaard <mark@klomp.org>

  Without the patch, the build fails with the following error:

    /usr/bin/ld: cannot find -lebl
    collect2: error: ld returned 1 exit status
    make: *** [makedumpfile] Error 1

  So remove it from LIBS for makedumpfile when elfutils does not have
  libebl.a.

  Signed-off-by: Pingfan Liu <piliu@redhat.com>
  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
* [PATCH] Add support for ELF extended numbering
  (Kazuhito Hagio, 2019-11-13)

  In ELF dump mode, makedumpfile cannot handle more than PN_XNUM
  (0xffff) program headers, so if a resulting dumpfile needs that many
  program headers, it creates a broken ELF dumpfile like this:

    # crash vmlinux dump.elf
    ...
    WARNING: possibly corrupt Elf64_Nhdr: n_namesz: 4185522176 n_descsz: 3 n_type: f4000
    ...
    WARNING: cannot read linux_banner string
    crash: vmlinux and dump.elf do not match!

  With this patch, if the actual number of program headers is PN_XNUM or
  more, the e_phnum field of the ELF header is set to PN_XNUM, and the
  actual number is set in the sh_info field of the section header at
  index 0. The section header is written just after the program headers;
  although this order is not typical, it keeps the code simple.

  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
* [PATCH] Fix wrong statistics in ELF format mode
  (Kazuhito Hagio, 2019-11-13)

  The -E option, which creates a dumpfile in ELF format, reports wrong
  statistics like the ones below, because it:

  (1) counts excluded pages repeatedly due to overlapped cycles
  (2) does not calculate the number of memory hole pages in cyclic mode
  (3) does not take account of the number of pages actually excluded in
      ELF format, which excludes only runs of 256 or more contiguous
      excludable pages.

    Original pages          : 0x0000000000000000
    Excluded pages          : 0x00000000006faedd
    Pages filled with zero  : 0x00000000000033d1
    Non-private cache pages : 0x0000000000046ff6
    Private cache pages     : 0x0000000000000001
    User process data pages : 0x00000000000144bb
    Free pages              : 0x000000000069c65a
    Hwpoison pages          : 0x0000000000000000
    Offline pages           : 0x0000000000000000
    Remaining pages         : 0xffffffffff905123
    Memory Hole             : 0x0000000000440000
    --------------------------------------------------
    Total pages             : 0x0000000000440000

  In order to fix this issue:

  (1) start the first cycle from the start pfn of a segment rounded by
      BITPERBYTE, to reduce overlaps between cycles
  (2) calculate the number of memory hole pages in cyclic mode
  (3) introduce a pfn_elf_excluded variable to store the actual number
      of pages excluded in ELF format

  With the patch, the report message in ELF format mode becomes:

    Original pages          : 0x00000000003f1538
    Excluded pages          : 0x00000000003c866b
          in ELF format     : 0x00000000003ba100
    Pages filled with zero  : 0x0000000000002cd8
    Non-private cache pages : 0x0000000000046ff6
    Private cache pages     : 0x0000000000000001
    User process data pages : 0x00000000000144bb
    Free pages              : 0x000000000036a4e1
    Hwpoison pages          : 0x0000000000000000
    Offline pages           : 0x0000000000000000
    Remaining pages         : 0x0000000000028ecd
          in ELF format     : 0x0000000000037438
    (The number of pages is reduced to 5%.)
    Memory Hole             : 0x000000000004eac8
    --------------------------------------------------
    Total pages             : 0x0000000000440000

  Here the "Excluded pages" and "Remaining pages" do not mean the actual
  numbers of excluded and remaining pages, but they are kept for
  reference.

  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
* [PATCH] Fix off-by-one issue in exclude_nodata_pages()
  (Mikhail Zaslonko, 2019-11-04)

  When building a dump bitmap (the 2nd bitmap) for the ELF dump, the
  last pfn of the cycle is always ignored in the
  exclude_nodata_pages() function due to an off-by-one error in the
  cycle boundary check. Thus, the respective bit of the bitmap is never
  cleared. That can lead to an error when such a pfn should not be
  dumpable (e.g. the last pfn of an ELF load segment of zero file
  size). Based on the bit in the bitmap, the page is treated as
  dumpable in write_elf_pages_cyclic(), and the follow-on error is
  triggered in write_elf_load_segment() due to the failing sanity check
  of paddr_to_offset2():

    $ makedumpfile -E dump.elf dump.elf.E
    Checking for memory holes : [100.0 %] |
    write_elf_load_segment: Can't convert physaddr(7ffff000) to an offset.
    makedumpfile Failed.

  Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
* [PATCH] Fix divide by zero in print_report()
  (Dave Jones, 2019-09-27)

  If info->max_mapnr and pfn_memhole are equal, we divide by zero when
  trying to determine the 'shrinking' value. On the system where I saw
  this error, we arrived at this function with:

    info->max_mapnr: 0x0000000001080000
    pfn_memhole:     0x0000000001080000

  Change the code to only print out the shrinking value when it makes
  sense.

  Signed-off-by: Dave Jones <davej@codemonkey.org.uk>
* [PATCH] Improve performance for non-thread compression with zlib
  (Kazuhito Hagio, 2019-09-18)

  Currently, the write_kdump_pages_cyclic() function uses compress2()
  to compress each page with zlib, but compress2() internally
  allocates/frees some memory and can cause some brk() calls. This can
  be inefficient. Using deflate() via the compress_mdf() wrapper we
  already have improves the performance by about 30% for certain
  vmcores.

  Without this patch:

    # /bin/time makedumpfile -c -d 1 vmcore dump.cd1
    ...
    19.13user 5.93system 0:25.06elapsed 99%CPU (0avgtext+0avgdata 34836maxresident)k
    0inputs+606824outputs (0major+4712478minor)pagefaults 0swaps

  With this patch:

    # /bin/time makedumpfile -c -d 1 vmcore dump.cd1
    ...
    17.90user 0.57system 0:18.48elapsed 99%CPU (0avgtext+0avgdata 34744maxresident)k
    0inputs+606792outputs (0major+499821minor)pagefaults 0swaps

  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
* [PATCH] Cleanup: Remove unnecessary len_buf_out_* variables
  (Kazuhito Hagio, 2019-09-18)

  Use calculate_len_buf_out() in write_kdump_pages_cyclic() to remove
  redundant code. Also replace len_buf_out_snappy with len_buf_out,
  because the latter is equal to or bigger than the former if USESNAPPY
  is defined.

  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
* [PATCH] Fix inconsistent return value from find_vmemmap()
  (Kazuhito Hagio, 2019-09-06)

  When the -e option is given, find_vmemmap() returns FAILED (1) if it
  fails on x86_64, but on other architectures it is stub_false() and
  returns FALSE (0):

    if (info->flag_excludevm) {
            if (find_vmemmap() == FAILED) {
                    ERRMSG("Can't find vmemmap pages\n");

    #define find_vmemmap()  stub_false()

  As a result, on architectures other than x86_64 the -e option does
  some unnecessary processing with no effect, and marks the dump
  DUMP_DH_EXCLUDED_VMEMMAP unexpectedly.

  Also, the functions for the -e option return COMPLETED or FAILED,
  which are meant for the command's return value, not for function
  return values. So fix the issue by following the common style of
  returning TRUE or FALSE, and avoid confusion.

  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
* [PATCH] Fix exclusion range in find_vmemmap_pages()
  (Kazuhito Hagio, 2019-08-30)

  In this function, pfn ranges are literally start and end, not start
  and end+1, so if the struct page of endpfn is the last one in a
  vmemmap page, that vmemmap page is dropped by the following code and
  not excluded:

    npfns_offset = endpfn - vmapp->rep_pfn_start;
    vmemmap_offset = npfns_offset * size_table.page;
    // round down to page boundary
    vmemmap_offset -= (vmemmap_offset % pagesize);

  We can use (endpfn + 1) here to fix this.

  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
* [PATCH] x86_64: Fix incorrect exclusion by -e option with KASLR
  (Kazuhito Hagio, 2019-08-30)

  The -e option uses info->vmemmap_start to create a table that
  determines the positions of the page structures to be excluded, but
  it is a hardcoded value even with a KASLR-enabled vmcore. As a
  result, the option excludes incorrect pages. To fix this, get the
  vmemmap start address from info->mem_map_data.

  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
* [PATCH] arm64: fix get_kaslr_offset_arm64() to return kaslr_offset correctly
  (Kazuhito Hagio, 2019-07-29)

  Currently, the get_kaslr_offset_arm64() function has the following
  condition to return info->kaslr_offset, but the kernel text mapping
  is placed in another range on arm64 by default, so it returns 0 for
  kernel text addresses:

    if (vaddr >= __START_KERNEL_map &&
        vaddr < __START_KERNEL_map + info->kaslr_offset)

  Consequently, kernel text symbols in an erase config are resolved
  wrongly with a KASLR-enabled vmcore, and makedumpfile erases
  unintended data.

  Since the return value of get_kaslr_offset_arm64() is used only in
  resolve_config_entry(), and in that case we must have a vmlinux, get
  the addresses of _text and _end from the vmlinux and use them.

  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
* [PATCH] Increase SECTION_MAP_LAST_BIT to 4
  (Kazuhito Hagio, 2019-07-23)

  Kernel commit 326e1b8f83a4 ("mm/sparsemem: introduce a
  SECTION_IS_EARLY flag") added the flag to the
  mem_section->section_mem_map value, which caused makedumpfile errors
  like the following:

    readmem: Can't convert a virtual address(fffffc97d1000000) to physical address.
    readmem: type_addr: 0, addr:fffffc97d1000000, size:32768
    __exclude_unnecessary_pages: Can't read the buffer of struct page.
    create_2nd_bitmap: Can't exclude unnecessary pages.

  To fix this, SECTION_MAP_LAST_BIT needs to be updated. The bit had
  not been used until this addition, so we can just increase the value.

  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
* [PATCH] Do not proceed when get_num_dumpable_cyclic() fails
  (Kazuhito Hagio, 2019-07-23)

  Currently, when get_num_dumpable_cyclic() fails and returns FALSE in
  create_dump_bitmap(), info->num_dumpable is set to 0 and makedumpfile
  proceeds to write a broken dumpfile slowly, with an incorrect
  progress indicator due to that value. It should not proceed when
  get_num_dumpable_cyclic() fails.

  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
* Release 1.6.6-4 to unstable
  (Thadeu Lima de Souza Cascardo, 2020-03-02)

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@debian.org>

* Set Rules-Requires-Root to no
  (Thadeu Lima de Souza Cascardo, 2020-03-02)

  There is no need for root to build the package, and all files belong
  to root anyway. dh_builddeb will end up calling dpkg-deb with
  --root-owner-group, which will do the right thing. A test build
  resulted in the same package contents; the binary packages were
  reproducible bit-by-bit after changing only that.

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@debian.org>

* udev: hotplug: use try-reload
  (Thadeu Lima de Souza Cascardo, 2020-03-02)

  We should not reload kdump unconditionally after a hotplug event, but
  only reload it when it was already loaded, which is what try-reload
  does.

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@debian.org>

* kdump-config: implement try-reload
  (Thadeu Lima de Souza Cascardo, 2020-03-02)

  Use a lock file so that try-reload can safely be run concurrently,
  and verify that kdump is loaded before trying to unload and load it
  again.

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@debian.org>

* Let the kernel decide the crashkernel offset for ppc64el (LP: #1741860)
  (Thadeu Lima de Souza Cascardo, 2020-03-02)

* Release 1.6.6-3 to unstable
  (Thadeu Lima de Souza Cascardo, 2020-03-02)

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@debian.org>
* Use reset_devices as a cmdline parameter.
  (Thadeu Lima de Souza Cascardo, 2020-03-02)

  reset_devices is used by some drivers to do a special reset during
  kdump. This allows systems with devices that use such drivers to
  kdump instead of failing to probe them.

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@debian.org>

* Use kdump-config reload after cpu or memory hotplug.
  (Thadeu Lima de Souza Cascardo, 2020-03-02)

  The solution used to allow multiple reloads during a hotplug event
  ended up not working because udev won't execute two commands under a
  shell. So using a single command that reloads and does not interact
  with systemd should work here. As an extra, this adds support for
  other init systems.

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@debian.org>

* Add a reload command.
  (Thadeu Lima de Souza Cascardo, 2020-03-02)

  kdump-config reload will unload the current kdump kernel and load a
  new one.

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@debian.org>

* Release 1.6.6-2 to unstable
  (Thadeu Lima de Souza Cascardo, 2020-03-02)

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@debian.org>
* Allow proper reload of kdump after multiple hotplug events.
  (Thadeu Lima de Souza Cascardo, 2020-03-02)

  When a CPU is hotplugged, multiple events will be issued, one for
  each CPU thread getting online. On a POWER system, that usually means
  8 threads, and those 8 events will cause systemd to consider the
  multiple restarts as failed.

  One alternative fix would be setting StartLimitIntervalSec to 0, but
  that would apply to all cases where those failures might happen, not
  only the hotplug case. Instead, we use reset-failed before
  try-restart, which allows those multiple restarts to happen within a
  short interval.

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@debian.org>

* Reload kdump when CPU is brought online.
  (Thadeu Lima de Souza Cascardo, 2020-03-02)

  This is needed on ppc64el, as CPUs are not added or removed, but
  simply brought online.

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

* Use maxcpus instead of nr_cpus on ppc64el.
  (Thadeu Lima de Souza Cascardo, 2020-03-02)

  On a kdump kernel, nr_cpus is broken, and it will take some time to
  be properly fixed. In the meantime, we can just use maxcpus. In the
  worst case, we will get an OOM and reboot instead of panicking too
  early during boot.

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

* Use a different service for vmcore dump.
  (Thadeu Lima de Souza Cascardo, 2020-03-02)

  During the dump itself, spurious CPU or memory hotplug events would
  cause the dump to fail, because the service would be restarted and an
  old incomplete dump would prevent the dump from being collected. This
  work also allows us to stop requiring network during kdump loading,
  and paves the way to stop requiring network when the dump is not over
  the network.

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@debian.org>

* Add kdump retry/delay mechanism when dumping over network
  (Guilherme G. Piccoli, 2020-03-02)

  Kdump currently tries mounting NFS (or doing the SSH dump) only once,
  and if that fails, it just gives up. Since kdump may be essential to
  debug hard-to-reproduce bugs, we should improve its resilience and
  retry a few times, delaying at each attempt. This patch introduces a
  retry/delay mechanism for both NFS and SSH dumps; the delay time is
  the same, but the number of retries differs (since NFS mounts take a
  long time between failures and are inherently more resilient), both
  being configurable parameters from /etc.

  The original trigger of this issue is a long-standing (bad) behavior
  of some NICs, which report a "Link Up" status _before_ being ready to
  transmit packets; hence network kdump will try and fail without this
  patch.

  Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@debian.org>
* Release 1.6.6-1 to unstable
  (Thadeu Lima de Souza Cascardo, 2019-07-12)

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@debian.org>

* Update changelog for new upstream 1.6.6
  (Thadeu Lima de Souza Cascardo, 2019-07-12)

  [git-debrebase changelog: new upstream 1.6.6]

* Update to upstream 1.6.6
  (Thadeu Lima de Souza Cascardo, 2019-07-12)

  [git-debrebase anchor: new upstream 1.6.6, merge]

* [v1.6.6] Update version
  (Kazuhito Hagio, 2019-06-27)

  Update makedumpfile to version 1.6.6.

  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>

* [PATCH] Support newer kernels up to v5.1
  (Kazuhito Hagio, 2019-06-13)

  The new makedumpfile supports newer kernels:
  - 4.20, 5.0, 5.1 (x86 FLATMEM)
  - 4.20, 5.0, 5.1 (x86 SPARSEMEM)
  - 4.20, 5.0, 5.1 (x86_64 SPARSEMEM)

  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
* [PATCH] x86_64: fix get_kaslr_offset_x86_64() to return kaslr_offset correctly
  (Kazuhito Hagio, 2019-05-24)

  Currently, the get_kaslr_offset_x86_64() function has the following
  condition to return info->kaslr_offset, but it is wrong: it can
  return 0 for kernel text addresses if info->kaslr_offset is small.

    if (vaddr >= __START_KERNEL_map &&
        vaddr < __START_KERNEL_map + info->kaslr_offset)

  Consequently, kernel text symbols in an erase config could be
  resolved wrongly, and makedumpfile fails to vtop with the following
  message or erases unintended data:

    __vtop4_x86_64: Can't get a valid pmd_pte.

  To fix this, use NUMBER(KERNEL_IMAGE_SIZE) from vmcoreinfo if
  available, otherwise use the hard-coded value (1 GiB) for KASLR,
  which has not changed since the initial KASLR implementation.

  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
* [PATCH] x86_64: Add support for AMD Secure Memory Encryption
  (Lianbo Jiang, 2019-03-11)

  On AMD machines with the Secure Memory Encryption (SME) feature, if
  SME is enabled, page tables contain a specific attribute bit (C-bit)
  in their entries to indicate whether a page is encrypted or
  unencrypted. So get NUMBER(sme_mask) from vmcoreinfo, which stores
  the value of the C-bit position, and drop it to obtain the true
  physical address.

  Signed-off-by: Lianbo Jiang <lijiang@redhat.com>

* [PATCH] exclude pages that are logically offline
  (David Hildenbrand, 2019-03-11)

  Linux marks pages that are logically offline via a page flag (map
  count). Such pages include, for example, pages inflated as part of a
  balloon driver or pages that were not actually onlined when onlining
  the whole section.

  While the hypervisor usually allows reading such inflated memory, we
  basically read and dump data that is completely irrelevant. Also,
  this might result in quite some overhead in the hypervisor. In
  addition, we saw some problems under Hyper-V, whereby we can crash
  the kernel by dumping, when reading memory of a partially onlined
  memory segment (for memory added by the Hyper-V balloon driver).

  Therefore, don't read and dump pages that are marked as being
  logically offline.

  Signed-off-by: David Hildenbrand <david@redhat.com>

* [PATCH] ppc64: fix a typo for checking the file pointer for null
  (Nisha Parrakat, 2019-02-14)

  Static code analysis of the makedumpfile code shows a mistake in
  checking the validity of a file pointer just attempted to open.

  arch/ppc64.c: fixed the typo that missed checking fpb, the file
  pointer last attempted to open. Found during cppcheck on the code.

  Signed-off-by: Nisha Parrakat <Nisha.Parrakat@kpit.com>
* [PATCH v2] honor the CFLAGS from environment variables
  (Kairui Song, 2019-01-29)

  This makes it possible to pass in extra cflags; for example,
  hardening flags could be passed in via an environment variable when
  building a hardened package. Also introduce a CFLAGS_BASE to hold
  common CFLAGS, which simplifies the CFLAGS definition.

  Suggested-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
  Signed-off-by: Kairui Song <kasong@redhat.com>

* [PATCH] Some improvements of debugging messages
  (Kazuhito Hagio, 2018-12-07)

  - x86_64: Add the info->phys_base value
  - Add VMCOREINFO data
  - Change the output formats of PT_LOAD and mem_map to make them
    shorter and more readable
  - Remove some extra new lines

  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>

* [PATCH] ppc64: increase MAX_PHYSMEM_BITS to 2PB
  (Hari Bathini, 2018-12-05)

  * Required for kernel 4.20

  With kernel commit 4ffe713b7587 ("powerpc/mm: Increase the max
  addressable memory to 2PB"), MAX_PHYSMEM_BITS is bumped up to 51 for
  the SPARSEMEM_VMEMMAP and SPARSEMEM_EXTREME case. Make the
  corresponding update here.

  Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>

* [v1.6.5] Update version
  (Kazuhito Hagio, 2018-12-04)

  This patch updates makedumpfile to version 1.6.5.

  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
* [PATCH] Support newer kernels
  (Kazuhito Hagio, 2018-11-27)

  The new makedumpfile supports newer kernels:
  - 4.19 (x86 FLATMEM)
  - 4.19 (x86 SPARSEMEM)
  - 4.19 (x86_64 SPARSEMEM)

  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>

* [PATCH] x86_64: fix failure of getting kcore vmcoreinfo on kernel 4.19
  (Kazuhito Hagio, 2018-11-21)

  * Required for kernel 4.19

  Kernel commit 6855dc41b246 ("x86: Add entry trampolines to kcore")
  added program headers for PTI entry trampoline pages to /proc/kcore.
  (Later, commit bf904d2762ee ("x86/pti/64: Remove the SYSCALL64 entry
  trampoline") removed them.) This caused the failure of makedumpfile
  --mem-usage due to a wrong calculation of info->page_offset:

    # makedumpfile --mem-usage /proc/kcore
    [...]
    set_kcore_vmcoreinfo: Can't get the offset of VMCOREINFO(/proc/kcore). Success
    makedumpfile Failed.

  Since the program headers for the direct maps are located after
  those, with this patch we select the last valid one to set
  page_offset. This patch also adds a few debug messages for better
  debugging.

  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>

* [PATCH] x86_64: fix an unnecessary message with --mem-usage option
  (Kazuhito Hagio, 2018-11-16)

  Commit bc8b3bbf ("arm64: restore info->page_offset and implement
  paddr_to_vaddr_arm64()") added get_phys_base() to show_mem_usage(),
  but at this point there is no vmcoreinfo yet, so
  get_phys_base_x86_64() executes SYMBOL_INIT() and prints the
  following message:

    # makedumpfile --mem-usage /proc/kcore
    init_dwarf_info: Can't find absolute path to debuginfo file.
    ...

  This patch adds a check for whether vmlinux is specified, and
  suppresses the message.

  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>

* [PATCH] arm64: restore info->page_offset and implement paddr_to_vaddr_arm64()
  (Kazuhito Hagio, 2018-11-16)

  Commit 94c97db3 ("arm64: Get 'info->page_offset' from PT_LOAD
  segments to support KASLR boot cases") added a method to determine
  info->page_offset (PAGE_OFFSET) from PT_LOAD segments on arm64
  platforms to support the --mem-usage option, but its hardcoded
  condition did not work correctly on several systems.

  This patch restores the method of determining the PAGE_OFFSET value,
  which matches the kernel's definition, and determines
  info->phys_offset from PT_LOAD by using PAGE_OFFSET. With these two
  values, implement paddr_to_vaddr_arm64() to support the --mem-usage
  option.

  Tested-by: Bhupesh Sharma <bhsharma@redhat.com>
  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>

* [PATCH] Prepare paddr_to_vaddr() for arch-specific p2v conversion
  (Kazuhito Hagio, 2018-11-16)

  Currently, the conversion from physical address to virtual address
  for the --mem-usage option is "paddr + PAGE_OFFSET", which was
  written for x86_64 and is not suitable for arm64 in particular. This
  patch introduces a paddr_to_vaddr() macro to prepare for
  arch-specific physical-to-virtual conversion.

  Tested-by: Bhupesh Sharma <bhsharma@redhat.com>
  Signed-off-by: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
* makedumpfile: sadump: fix failure of reading 640 KB backup region if at over 4GB location
  (Hatayama, Daisuke, 2018-10-31)

  Currently, in the function sadump_kdump_backup_region_init(), the
  variable mem holding the physical memory address to read as a
  candidate for the ELF core header is of type unsigned int, with just
  4 bytes:

    for (i = 0; i < ARRAY_LENGTH(kimage.segment); ++i) {
            char e_ident[EI_NIDENT];
            unsigned mem;

            mem = ULONG(buf + i * SIZE(kexec_segment) +
                        OFFSET(kexec_segment.mem));
            if (!mem)
                    continue;

            if (!readmem(PADDR, mem, e_ident, SELFMAG)) {
                    DEBUG_MSG("sadump: failed to read elfcorehdr buffer\n");
                    return;
            }

  Thus, if the backup region for the first 640 KB of physical memory is
  located above 4 GB because of crashkernel=size,high, like:

    # grep crashkernel /proc/cmdline
    BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18 root=/dev/mapper/rhel-root ro crashkernel=512M,high
    # grep Crash /proc/iomem
    06000000-15ffffff : Crash kernel
    107f000000-109effffff : Crash kernel

    crash> rd -p 0x109ef5d000
    109ef5d000:  00010102464c457f                    .ELF....

  the upper 32 bits of the physical address in the mem variable are
  dropped and readmem() fails, while outputting the following debug
  message:

    # LANG=C ./makedumpfile --message-level 8 -f -l -d 31 -x ./vmlinux /dev/sdc vmcore-ld31
    sadump: read dump device as single partition
    sadump: single partition configuration
    page_size    : 4096
    sadump: timezone information is missing
    sadump: idtr=fffffe0000000000
    sadump: cr3=86b42e000
    sadump: idtr(phys)=4c35cc000
    sadump: devide_error(vmlinux)=ffffffff81a00c50
    sadump: devide_error(vmcore)=ffffffffa0c00c50
    sadump: cmdline vaddr: ffffffffa1bcf008
    sadump: cmdline paddr: 4c35cf008
    sadump: cmdline buf vaddr: ffff8ae89ffceec0
    sadump: cmdline buf paddr: 109ffceec0
    sadump: kaslr_offset=1f200000
    sadump: phys_base=4a1a00000
    sadump: online cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 [...]
    sadump: nr_cpus: 60
    sadump: failed to read elfcorehdr buffer   <-- reading the ELF core header fails

  Then the generated vmcore has invalid data in its first 640 KB part.
  The variable mem needs to be of a 64-bit type.

  With this patch, the kdump backup region is successfully found:

    # LANG=C ./makedumpfile --message-level 31 -f -l -d 31 -x ./vmlinux /dev/sdc vmcore-ld31
    sadump: read dump device as single partition
    sadump: single partition configuration
    page_size    : 4096
    [...]
    sadump: nr_cpus: 60
    The kernel version is not supported.
    The makedumpfile operation may be incomplete.
    sadump: SRC_START: 0x00000000001000 SRC_SIZE: 0x0000000009f000 SRC_OFFSET: 0x0000109ef61000
    sadump: kdump backup region used
    ...<snip>...

  By the way, before crashkernel=size,high was introduced, there was a
  limitation that the ELF core header reside at a location below 4 GB,
  so defining mem as unsigned int was not entirely wrong at that time.

  Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
| * | | [PATCH] Update help text to indicate --mem-usage is supported on archs other ↵Bhupesh Sharma2018-10-25
| | | | than x86_64

Commit 8449bda73ab14516d4bf81d29503c1ea203bb865 ("Documentation: Update
documentation regarding '--mem-usage' option") updated the makedumpfile
man page to indicate that this option is now supported on x86_64, arm64,
ppc64 and s390x.

However, the help text of makedumpfile (which one can see by running
'makedumpfile --help') still reflects the old support status, i.e. that
this option is supported only on x86_64.

This patch brings the help text in sync with the man page text.

Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
| * | | [PATCH] Fix failure of detection of SPARSEMEM EXTREME in case of -x VMLINUXHatayama, Daisuke2018-10-24
This issue was introduced by commit f3c87e0ab1f62b118e738d046c3d676325770418.

Currently, is_sparsemem_extreme() compares ARRAY_LENGTH(mem_section)
with NOT_FOUND_SYMBOL, but the correct initial value for an array table
is NOT_FOUND_STRUCTURE. As a result, makedumpfile fails to detect
SPARSEMEM EXTREME and hence fails to convert a vmcore captured by
sadump, as follows:

    # LANG=C makedumpfile --message-level 31 -f -l -d 31 -x ./vmlinux /dev/sdc vmcore-ld31
    sadump: read dump device as single partition
    sadump: single partition configuration
    page_size    : 4096
    sadump: timezone information is missing
    sadump: idtr=fffffe0000000000
    sadump: cr3=ba4e0a000
    sadump: idtr(phys)=ba55cc000
    sadump: devide_error(vmlinux)=ffffffff81a00c50
    sadump: devide_error(vmcore)=ffffffff83c00c50
    sadump: cmdline vaddr: ffffffff84bcf008
    sadump: cmdline paddr: ba55cf008
    sadump: cmdline buf vaddr: ffff8fa39ffceec0
    sadump: cmdline buf paddr: 109ffceec0
    sadump: kaslr_offset=2200000
    sadump: phys_base=ba0a00000
    sadump: online cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 [...]
    sadump: nr_cpus: 60
    The kernel version is not supported.
    The makedumpfile operation may be incomplete.
    sadump: SRC_START: 0x00000000001000 SRC_SIZE: 0x0000000009f000 SRC_OFFSET: 0x00000025f61000
    sadump: kdump backup region unused
    num of NODEs : 2
    Memory type  : SPARSEMEM
    get_mm_sparsemem: Can't get the address of mem_section.
    makedumpfile Failed.
This issue doesn't occur for a vmcore captured by kdump, because in that
case the length of mem_section is provided via VMCOREINFO and
is_sparsemem_extreme() returns TRUE via the other path. The issue also
occurs with other mechanisms that require -x VMLINUX, such as virsh dump.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>