commit 6371f030c4dc8b69140819e92803aae7e6039cd6
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Wed Aug 30 10:32:30 2017 +0200

    Linux 4.12.10

commit 849e96758ab22c0a9f307095f3950ca34b1def4e
Author: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date:   Mon Jul 24 14:28:00 2017 +1000

    powerpc/mm: Ensure cpumask update is ordered
    
    commit 1a92a80ad386a1a6e3b36d576d52a1a456394b70 upstream.
    
    There is no guarantee that the various isync's involved with
    the context switch will order the update of the CPU mask with
    the first TLB entry for the new context being loaded by the HW.
    
    Be safe here and add a memory barrier to order any subsequent
    load/store which may bring entries into the TLB.
    
    The corresponding barrier on the other side already exists as
    pte updates use pte_xchg() which uses __cmpxchg_u64 which has
    a sync after the atomic operation.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
    [mpe: Add comments in the code]
    [mpe: Backport to 4.12, minor context change]
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 53220a20cec09d217de22ed3580c7d4c51972b85
Author: Lv Zheng <lv.zheng@intel.com>
Date:   Wed Aug 16 15:29:49 2017 +0800

    ACPI: EC: Fix regression related to wrong ECDT initialization order
    
    commit 98529b9272e06a7767034fb8a32e43cdecda240a upstream.
    
    Commit 2a5708409e4e (ACPI / EC: Fix a gap that ECDT EC cannot handle
    EC events) introduced acpi_ec_ecdt_start(), but that function is
    invoked before acpi_ec_query_init(), which is too early.  This causes
    the kernel to crash if an EC event occurs after boot, when ec_query_wq
    is not valid:
    
     BUG: unable to handle kernel NULL pointer dereference at 0000000000000102
     ...
     Workqueue: events acpi_ec_event_handler
     task: ffff9f539790dac0 task.stack: ffffb437c0e10000
     RIP: 0010:__queue_work+0x32/0x430
    
    Normally, the DSDT EC should always be valid, so acpi_ec_ecdt_start()
    is actually a no-op in the majority of cases.  However, commit
    c712bb58d827 (ACPI / EC: Add support to skip boot stage DSDT probe)
    caused the probing of the DSDT EC as the "boot EC" to be skipped when
    the ECDT EC is valid and uncovered the bug.
    
    Fix this issue by invoking acpi_ec_ecdt_start() after acpi_ec_query_init()
    in acpi_ec_init().
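    
    The ordering constraint can be illustrated with a minimal userspace
    sketch. This is not the kernel code: every ex_* name below is a
    hypothetical stand-in, and the NULL check only models the case that
    crashed in __queue_work().

```c
#include <stddef.h>

/* Hypothetical stand-ins for the query workqueue and the init steps. */
struct ex_wq { int pending; };

static struct ex_wq ex_wq_storage;
struct ex_wq *ex_query_wq;	/* NULL until ex_query_init() runs */

int ex_query_init(void)
{
	ex_query_wq = &ex_wq_storage;
	return 0;
}

/* Starting event handling is only safe once the workqueue exists. */
int ex_ecdt_start(void)
{
	if (!ex_query_wq)
		return -1;	/* models the NULL dereference seen in the kernel */
	ex_query_wq->pending++;
	return 0;
}

/* Fixed order: create the workqueue first, then start the ECDT EC. */
int ex_ec_init(void)
{
	int ret = ex_query_init();

	if (ret)
		return ret;
	return ex_ecdt_start();
}
```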
    
    Link: https://jira01.devtools.intel.com/browse/LCK-4348
    Fixes: 2a5708409e4e (ACPI / EC: Fix a gap that ECDT EC cannot handle EC events)
    Fixes: c712bb58d827 (ACPI / EC: Add support to skip boot stage DSDT probe)
    Reported-by: Wang Wendy <wendy.wang@intel.com>
    Tested-by: Feng Chenzhou <chenzhoux.feng@intel.com>
    Signed-off-by: Lv Zheng <lv.zheng@intel.com>
    [ rjw: Changelog ]
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6e80b88a7f7dffa0340236ca420357295319c150
Author: Hanjun Guo <hanjun.guo@linaro.org>
Date:   Fri Jul 28 17:42:35 2017 +0800

    ACPI: APD: Fix HID for Hisilicon Hip07/08
    
    commit f7f3dd5b4cbb138ed4559b0d096bab76a8f476de upstream.
    
    The ACPI HID for Hisilicon Hip07/08 should be HISI02A1/2, not
    HISI0A21/2.  HISI02A1/2 was tested OK, but I introduced a typo when
    upstreaming the patches.  Correct them to the right IDs (matching
    the IDs in drivers/i2c/busses/i2c-designware-platdrv.c).
    
    Fixes: 6e14cf361a0c (ACPI / APD: Add clock frequency for Hisilicon Hip07/08 I2C controller)
    Reported-by: Tao Tian <tiantao6@huawei.com>
    Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 49fa8c02e4a6359f5c13ce14ee9a79d53e1865ff
Author: Dave Jiang <dave.jiang@intel.com>
Date:   Fri Jul 28 15:10:48 2017 -0700

    ntb: transport shouldn't disable link due to bogus values in SPADs
    
    commit f3fd2afed8eee91620d05b69ab94c14793c849d7 upstream.
    
    It seems that under certain scenarios the SPAD can have bogus values caused
    by an agent (e.g. BIOS or other software) that is not the kernel driver, and
    that causes memory window setup failure. This should not cause the link to
    be disabled because if we do that, the driver will never recover again. We
    have verified in testing that this issue happens and prevents proper link
    recovery.
    
    Signed-off-by: Dave Jiang <dave.jiang@intel.com>
    Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
    Signed-off-by: Jon Mason <jdmason@kudzu.us>
    Fixes: 84f766855f61 ("ntb: stop link work when we do not have memory")
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ab75f0274d344a7a972ae67a283dea9fe8291c87
Author: Logan Gunthorpe <logang@deltatee.com>
Date:   Tue Jul 25 14:57:42 2017 -0600

    ntb: ntb_test: ensure the link is up before trying to configure the mws
    
    commit 0eb46345364d7318b11068c46e8a68d5dc10f65e upstream.
    
    After the link tests, there is a race on one side of the test for
    the link coming up. It's possible, in some cases, for the test script
    to write to the 'peer_trans' files before the link has come up.
    
    To fix this, we simply use the link event file to ensure both sides
    see the link as up before continuing.
    
    Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
    Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
    Signed-off-by: Jon Mason <jdmason@kudzu.us>
    Fixes: a9c59ef77458 ("ntb_test: Add a selftest script for the NTB subsystem")
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 03e58884668e2eb970e2dd09ae9ff48c932b7695
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Sun Aug 27 12:12:25 2017 -0700

    Clarify (and fix) MAX_LFS_FILESIZE macros
    
    commit 0cc3b0ec23ce4c69e1e890ed2b8d2fa932b14aad upstream.
    
    We have a MAX_LFS_FILESIZE macro that is meant to be filled in by
    filesystems (and other IO targets) that know they are 64-bit clean and
    don't have any 32-bit limits in their IO path.
    
    It turns out that our 32-bit value for that limit was bogus.  On 32-bit,
    the VM layer is limited by the page cache to only 32-bit index values,
    but our logic for that was confusing and actually wrong.  We used to
    define that value to
    
            (((loff_t)PAGE_SIZE << (BITS_PER_LONG-1))-1)
    
    which is actually odd in several ways: it limits the index to 31 bits,
    and then it limits files so that they can't have data in that last byte
    of a page that has the highest 31-bit index (ie page index 0x7fffffff).
    
    Neither of those limitations makes sense.  The index is actually the full
    32 bit unsigned value, and we can use that whole full page.  So the
    maximum size of the file would logically be "PAGE_SIZE << BITS_PER_LONG".
    
    However, we do want to avoid the maximum index, because we have code
    that iterates over the page indexes, and we don't want that code to
    overflow.  So the maximum size of a file on a 32-bit host should
    actually be one page less than the full 32-bit index.
    
    So the actual limit is ULONG_MAX << PAGE_SHIFT.  That means that we will
    not actually be using the page of that last index (ULONG_MAX), but we
    can grow a file up to that limit.
    
    The wrong value of MAX_LFS_FILESIZE actually caused problems for Doug
    Nazar, who was still using a 32-bit host, but with a 9.7TB 2 x RAID5
    volume.  It turns out that our old MAX_LFS_FILESIZE was 8TiB (well, one
    byte less), but the actual true VM limit is one page less than 16TiB.
    
    This was invisible until commit c2a9737f45e2 ("vfs,mm: fix a dead loop
    in truncate_inode_pages_range()"), which started applying that
    MAX_LFS_FILESIZE limit to block devices too.
    
    NOTE! On 64-bit, the page index isn't a limiter at all, and the limit is
    actually just the offset type itself (loff_t), which is signed.  But for
    clarity, on 64-bit, just use the maximum signed value, and don't make
    people have to count the number of 'f' characters in the hex constant.
    
    So just use LLONG_MAX for the 64-bit case.  That was what the value had
    been before too, just written out as a hex constant.
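    
    As a worked check of the 32-bit arithmetic above, here is a small
    sketch. It assumes 4 KiB pages and a 32-bit unsigned long; the EX_*
    constants and function names are illustrative and are not the
    kernel's macros.

```c
#include <stdint.h>

/* Illustrative constants: 4 KiB pages and a 32-bit unsigned long are assumed. */
#define EX_PAGE_SHIFT		12
#define EX_PAGE_SIZE		(INT64_C(1) << EX_PAGE_SHIFT)
#define EX_BITS_PER_LONG	32
#define EX_ULONG_MAX_32		INT64_C(0xffffffff)

/* Old 32-bit limit: ((loff_t)PAGE_SIZE << (BITS_PER_LONG-1)) - 1 */
int64_t ex_old_max_lfs_filesize_32(void)
{
	return (EX_PAGE_SIZE << (EX_BITS_PER_LONG - 1)) - 1;
}

/* New 32-bit limit: (loff_t)ULONG_MAX << PAGE_SHIFT */
int64_t ex_new_max_lfs_filesize_32(void)
{
	return EX_ULONG_MAX_32 << EX_PAGE_SHIFT;
}
```

    Under those assumptions the old value is 2^43 - 1 (one byte less
    than 8TiB) and the new value is 2^44 - 4096 (one page less than
    16TiB).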
    
    Fixes: c2a9737f45e2 ("vfs,mm: fix a dead loop in truncate_inode_pages_range()")
    Reported-and-tested-by: Doug Nazar <nazard@nazar.ca>
    Cc: Andreas Dilger <adilger@dilger.ca>
    Cc: Mark Fasheh <mfasheh@versity.com>
    Cc: Joel Becker <jlbec@evilplan.org>
    Cc: Dave Kleikamp <shaggy@kernel.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 0b9a3f300f89c47ffec8c2eea474d33beaebb0ac
Author: Joerg Roedel <jroedel@suse.de>
Date:   Mon Aug 14 17:19:26 2017 +0200

    iommu: Fix wrong freeing of iommu_device->dev
    
    commit 2926a2aa5c14fb2add75e6584845b1c03022235f upstream.
    
    The struct iommu_device has a 'struct device' embedded into
    it, not as a pointer, but the whole struct. In the
    conversion of the iommu drivers to use struct iommu_device
    it was forgotten that the release function for that struct
    device simply calls kfree() on the pointer.
    
    This frees memory that was never allocated and causes memory
    corruption.
    
    To fix this issue, use a pointer to struct device instead of
    embedding the whole struct. This needs some updates in the
    iommu sysfs code as well as the Intel VT-d and AMD IOMMU
    driver.
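    
    The shape of the fix can be sketched in userspace C. This is an
    illustration only: the ex_* types are hypothetical stand-ins for
    struct device and struct iommu_device, and free() stands in for
    kfree(). The point is that the release callback frees the pointer
    it is handed, so that pointer must be its own allocation rather
    than a member embedded in a larger struct.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for struct device and struct iommu_device. */
struct ex_device {
	void (*release)(struct ex_device *dev);
	int id;
};

struct ex_iommu_device {
	int flags;
	struct ex_device *dev;	/* pointer to a separate allocation */
};

int ex_release_count;

/* Frees exactly the pointer it is given, as the release function does. */
static void ex_release(struct ex_device *dev)
{
	ex_release_count++;
	free(dev);
}

int ex_iommu_register(struct ex_iommu_device *iommu)
{
	iommu->dev = calloc(1, sizeof(*iommu->dev));
	if (!iommu->dev)
		return -1;
	iommu->dev->release = ex_release;
	return 0;
}

void ex_iommu_unregister(struct ex_iommu_device *iommu)
{
	iommu->dev->release(iommu->dev);
	iommu->dev = NULL;
}
```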
    
    Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
    Fixes: 39ab9555c241 ('iommu: Add sysfs bindings for struct iommu_device')
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 75005bf89ad7b63bdc66d088070aaff71f14b161
Author: Charles Milette <charlesmilette@gmail.com>
Date:   Fri Aug 18 16:30:34 2017 -0400

    staging: rtl8188eu: add RNX-N150NUB support
    
    commit f299aec6ebd747298e35934cff7709c6b119ca52 upstream.
    
    Add support for USB Device Rosewill RNX-N150NUB.
    VendorID: 0x0bda, ProductID: 0xffef
    
    Signed-off-by: Charles Milette <charles.milette@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 91628e2afc865de2c8d288ec8ba117d872a40856
Author: Lorenzo Bianconi <lorenzo.bianconi83@gmail.com>
Date:   Wed Aug 16 19:02:51 2017 +0200

    iio: magnetometer: st_magn: remove ihl property for LSM303AGR
    
    commit 8b35a5f87a73842601cd376e0f5b9b25831390f4 upstream.
    
    Remove IRQ active low support for LSM303AGR since the sensor does not
    support that capability for the data-ready line.
    
    Fixes: a9fd053b56c6 (iio: st_sensors: support active-low interrupts)
    Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@st.com>
    Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e59c095c11af506fa98a92957ac669080cbd0e04
Author: Lorenzo Bianconi <lorenzo.bianconi83@gmail.com>
Date:   Wed Aug 16 19:02:50 2017 +0200

    iio: magnetometer: st_magn: fix status register address for LSM303AGR
    
    commit 541ee9b24fca587f510fe1bc58508d5cf40707af upstream.
    
    Fixes: 97865fe41322 (iio: st_sensors: verify interrupt event to status)
    Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@st.com>
    Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit fc7957b6cdd787e70923dfce250f0564590fe922
Author: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Date:   Sat Aug 12 09:09:21 2017 -0700

    iio: hid-sensor-trigger: Fix the race with user space powering up sensors
    
    commit f1664eaacec31035450132c46ed2915fd2b2049a upstream.
    
    It has been reported for a while that with iio-sensor-proxy service the
    rotation only works after one suspend/resume cycle. This required a wait
    in the systemd unit file to avoid race. I found a Yoga 900 where I could
    reproduce this.
    
    The problem scenario is:
    - During sensor driver init, runtime PM is enabled and an
      auto-suspend delay of 3 seconds is set. This results in one
      runtime resume. There is a check to avoid a powerup in this
      sequence, so the sensors stay off even though rpm is active.
    - User space iio-sensor-proxy tries to power up the sensor. Since rpm
      is active, the call simply returns. But the sensors were not
      actually powered up in the prior sequence, so they will not work.
    - After 3 seconds the auto suspend kicks in.
    
    If we add a wait in the systemd service file to start iio-sensor-proxy
    after 3 seconds, everything works, because the runtime resume will
    actually power up the sensor as this is a user request.
    
    To avoid this:
    - Remove the check to match the user requested state. This causes a
      brief powerup, but if iio-sensor-proxy starts immediately it will
      still work as the sensors are ON.
    - Also move the autosuspend delay to the place where the user requests
      turning off the sensors, such as after a raw read finishes or the
      buffer is disabled.
    
    Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
    Tested-by: Bastien Nocera <hadess@hadess.net>
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a1d7b7e7e11617d7a6727fda34279a62a9be8d04
Author: Dragos Bogdan <dragos.bogdan@analog.com>
Date:   Fri Aug 4 01:37:27 2017 +0300

    iio: imu: adis16480: Fix acceleration scale factor for adis16480
    
    commit fdd0d32eb95f135041236a6885d9006315aa9a1d upstream.
    
    According to the datasheet, the range of the acceleration is [-10 g, + 10 g],
    so the scale factor should be 10 instead of 5.
    
    Signed-off-by: Dragos Bogdan <dragos.bogdan@analog.com>
    Acked-by: Lars-Peter Clausen <lars@metafoo.de>
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit bf9b9d3b382b51b7a43ebddabd07ec7034772d21
Author: Martijn Coenen <maco@android.com>
Date:   Fri Jul 28 13:56:08 2017 +0200

    ANDROID: binder: fix proc->tsk check.
    
    commit b2a6d1b999a4c13e5997bb864694e77172d45250 upstream.
    
    Commit c4ea41ba195d ("binder: use group leader instead of open thread")
    was incomplete and didn't update a check in binder_mmap(), causing all
    mmap() calls into the binder driver to fail.
    
    Signed-off-by: Martijn Coenen <maco@android.com>
    Tested-by: John Stultz <john.stultz@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f6fc60d915497c14cd58e06d857533220645c916
Author: Riley Andrews <riandrews@google.com>
Date:   Thu Jun 29 12:01:37 2017 -0700

    binder: Use wake up hint for synchronous transactions.
    
    commit 00b40d613352c623aaae88a44e5ded7c912909d7 upstream.
    
    Use wake_up_interruptible_sync() to hint to the scheduler that binder
    transactions are synchronous wakeups. Disable preemption while waking
    to avoid ping-ponging on the binder lock.
    
    Signed-off-by: Todd Kjos <tkjos@google.com>
    Signed-off-by: Omprakash Dhyade <odhyade@codeaurora.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 7771e3f4b0b9e2bc289812de2dc18d219d56788d
Author: Todd Kjos <tkjos@android.com>
Date:   Thu Jun 29 12:01:36 2017 -0700

    binder: use group leader instead of open thread
    
    commit c4ea41ba195d01c9af66fb28711a16cc97caa9c5 upstream.
    
    The binder allocator assumes that the thread that
    called binder_open will never die for the lifetime of
    that proc. That thread is normally the group_leader,
    however it may not be. Use the group_leader instead
    of current.
    
    Signed-off-by: Todd Kjos <tkjos@google.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 62ccb816aab8d1b89d983bb669982bebb9b258fb
Author: Todd Kjos <tkjos@android.com>
Date:   Wed Jul 5 13:46:01 2017 -0700

    Revert "android: binder: Sanity check at binder ioctl"
    
    commit a2b18708ee14baec4ef9c0fba96070bba14d0081 upstream.
    
    This reverts commit a906d6931f3ccaf7de805643190765ddd7378e27.
    
    The patch introduced a race in the binder driver. An attempt to fix the
    race was submitted in "[PATCH v2] android: binder: fix dangling pointer
    comparison", however the conclusion in the discussion for that patch
    was that the original patch should be reverted.
    
    The reversion is being done as part of the fine-grained locking
    patchset since the patch would need to be refactored when
    proc->vmm_vm_mm is removed from struct binder_proc and added
    in the binder allocator.
    
    Signed-off-by: Todd Kjos <tkjos@google.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b42c44ade798e4b6358073294ae045ef0f1c84d6
Author: Jeffy Chen <jeffy.chen@rock-chips.com>
Date:   Tue Jun 27 17:34:42 2017 +0800

    Bluetooth: bnep: fix possible might sleep error in bnep_session
    
    commit 25717382c1dd0ddced2059053e3ca5088665f7a5 upstream.
    
    It looks like bnep_session has the same pattern as the issue reported
    in old rfcomm:
    
            while (1) {
                    set_current_state(TASK_INTERRUPTIBLE);
                    if (condition)
                            break;
                    // may call might_sleep here
                    schedule();
            }
            __set_current_state(TASK_RUNNING);
    
    Which was fixed in:
            dfb2fae Bluetooth: Fix nested sleeps
    
    So let's fix it in the same way, also following the suggestion of:
    https://lwn.net/Articles/628628/
    
    Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
    Reviewed-by: Brian Norris <briannorris@chromium.org>
    Reviewed-by: AL Yu-Chen Cho <acho@suse.com>
    Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
    Cc: Jiri Slaby <jslaby@suse.cz>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b741896229c17d9fa058ca005e81dd53717acc8a
Author: Jeffy Chen <jeffy.chen@rock-chips.com>
Date:   Tue Jun 27 17:34:43 2017 +0800

    Bluetooth: cmtp: fix possible might sleep error in cmtp_session
    
    commit f06d977309d09253c744e54e75c5295ecc52b7b4 upstream.
    
    It looks like cmtp_session has the same pattern as the issue reported
    in old rfcomm:
    
            while (1) {
                    set_current_state(TASK_INTERRUPTIBLE);
                    if (condition)
                            break;
                    // may call might_sleep here
                    schedule();
            }
            __set_current_state(TASK_RUNNING);
    
    Which was fixed in:
            dfb2fae Bluetooth: Fix nested sleeps
    
    So let's fix it in the same way, also following the suggestion of:
    https://lwn.net/Articles/628628/
    
    Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
    Reviewed-by: Brian Norris <briannorris@chromium.org>
    Reviewed-by: AL Yu-Chen Cho <acho@suse.com>
    Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
    Cc: Jiri Slaby <jslaby@suse.cz>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e792d2d48928acd0f8606585a4359bbde90c0654
Author: Jeffy Chen <jeffy.chen@rock-chips.com>
Date:   Tue Jun 27 17:34:44 2017 +0800

    Bluetooth: hidp: fix possible might sleep error in hidp_session_thread
    
    commit 5da8e47d849d3d37b14129f038782a095b9ad049 upstream.
    
    It looks like hidp_session_thread has the same pattern as the issue
    reported in old rfcomm:
    
            while (1) {
                    set_current_state(TASK_INTERRUPTIBLE);
                    if (condition)
                            break;
                    // may call might_sleep here
                    schedule();
            }
            __set_current_state(TASK_RUNNING);
    
    Which was fixed in:
            dfb2fae Bluetooth: Fix nested sleeps
    
    So let's fix it in the same way, also following the suggestion of:
    https://lwn.net/Articles/628628/
    
    Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
    Tested-by: AL Yu-Chen Cho <acho@suse.com>
    Tested-by: Rohit Vaswani <rvaswani@nvidia.com>
    Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
    Cc: Jiri Slaby <jslaby@suse.cz>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 1eb33a1b89e107d103d43d4ee757e31cfd5c832c
Author: Mateusz Jurczyk <mjurczyk@google.com>
Date:   Wed Jun 7 15:50:38 2017 +0200

    netfilter: nfnetlink: Improve input length sanitization in nfnetlink_rcv
    
    commit f55ce7b024090a51382ccab2730b96e2f7b4e9cf upstream.
    
    Verify that the length of the socket buffer is sufficient to cover the
    nlmsghdr structure before accessing the nlh->nlmsg_len field for further
    input sanitization. If the client only supplies 1-3 bytes of data in
    sk_buff, then nlh->nlmsg_len remains partially uninitialized and
    contains leftover memory from the corresponding kernel allocation.
    Operating on such data may result in indeterminate evaluation of the
    nlmsg_len < NLMSG_HDRLEN expression.
    
    The bug was discovered by a runtime instrumentation designed to detect
    use of uninitialized memory in the kernel. The patch prevents this and
    other similar tools (e.g. KMSAN) from flagging this behavior in the future.
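    
    The check order can be sketched in plain C. This is a userspace
    illustration, not the nfnetlink code: struct ex_msghdr and
    ex_msg_ok() are hypothetical names, and the header layout is only
    assumed to resemble a netlink header. The buffer length is compared
    against the header size before any header field is read.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a netlink-style header. */
struct ex_msghdr {
	uint32_t len;	/* total message length, header included */
	uint16_t type;
	uint16_t flags;
};

/* Returns 1 if buf holds a complete message, 0 otherwise. */
int ex_msg_ok(const unsigned char *buf, size_t buflen)
{
	struct ex_msghdr hdr;

	/* Make sure the buffer covers the header before reading hdr.len. */
	if (buflen < sizeof(hdr))
		return 0;

	memcpy(&hdr, buf, sizeof(hdr));

	/* Only now is it safe to sanity-check the length field itself. */
	if (hdr.len < sizeof(hdr) || buflen < hdr.len)
		return 0;
	return 1;
}
```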
    
    Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Cc: Florian Westphal <fw@strlen.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 8b5041077024eca607ae21d90186110452241668
Author: Florian Westphal <fw@strlen.de>
Date:   Fri Jul 7 13:07:17 2017 +0200

    netfilter: nat: fix src map lookup
    
    commit 97772bcd56efa21d9d8976db6f205574ea602f51 upstream.
    
    When doing initial conversion to rhashtable I replaced the bucket
    walk with a single rhashtable_lookup_fast().
    
    When moving to rhlist I failed to properly walk the list of identical
    tuples, but that is what is needed for this to work correctly.
    The table contains the original tuples, so the reply tuples are all
    distinct.
    
    We currently decide whether the mapping is in range based only on the
    first entry, but if it is not, we need to try the reply tuple of the
    next entry until we either find an in-range mapping or have checked
    all the entries.
    
    This bug makes nat core attempt collision resolution while it might be
    able to use the mapping as-is.
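    
    The intended lookup can be sketched as a plain list walk. This is a
    simplified userspace illustration, not the rhlist code: the ex_*
    types are hypothetical, and a single port number stands in for the
    reply tuple that the real code compares against the range.

```c
#include <stddef.h>

/* Hypothetical stand-ins: one candidate mapping per list entry. */
struct ex_entry {
	unsigned int mapped_port;	/* taken from the entry's reply tuple */
	const struct ex_entry *next;	/* next entry with the same original tuple */
};

struct ex_range {
	unsigned int min, max;
};

static int ex_in_range(const struct ex_entry *e, const struct ex_range *r)
{
	return e->mapped_port >= r->min && e->mapped_port <= r->max;
}

/* Walk every entry, not just the first, until one is in range. */
const struct ex_entry *ex_find_mapping(const struct ex_entry *head,
				       const struct ex_range *r)
{
	const struct ex_entry *e;

	for (e = head; e; e = e->next)
		if (ex_in_range(e, r))
			return e;
	return NULL;	/* nothing usable: caller falls back to collision resolution */
}
```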
    
    Fixes: 870190a9ec90 ("netfilter: nat: convert nat bysrc hash to rhashtable")
    Reported-by: Jaco Kroon <jaco@uls.co.za>
    Tested-by: Jaco Kroon <jaco@uls.co.za>
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f5263887165a2f9bad884239e25df00b46091b50
Author: Florian Westphal <fw@strlen.de>
Date:   Mon Jul 10 13:53:53 2017 +0200

    netfilter: expect: fix crash when putting uninited expectation
    
    commit 36ac344e16e04e3e55e8fed7446095a6458c64e6 upstream.
    
    We crash in __nf_ct_expect_check because it calls nf_ct_remove_expect on
    the uninitialised expectation instead of the existing one, so del_timer
    chokes on a random memory address.
    
    Fixes: ec0e3f01114ad32711243 ("netfilter: nf_ct_expect: Add nf_ct_remove_expect()")
    Reported-by: Sergey Kvachonok <ravenexp@gmail.com>
    Tested-by: Sergey Kvachonok <ravenexp@gmail.com>
    Cc: Gao Feng <fgao@ikuai8.com>
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 4909a7b79965764bd9a1f4c4f9f5000f1e57683a
Author: Vadim Lomovtsev <vlomovts@redhat.com>
Date:   Mon Aug 21 07:23:07 2017 -0400

    net: sunrpc: svcsock: fix NULL-pointer exception
    
    commit eebe53e87f97975ee58a21693e44797608bf679c upstream.
    
    While running nfs/connectathon tests, a kernel NULL-pointer exception
    has been observed due to races in svcsock.c.
    
    The race appears when the kernel accepts a connection via kernel_accept
    (which creates a new socket) and starts queuing ingress packets
    to the new socket. This happens in ksoftirq context, which could run
    concurrently on a different core while the new socket setup is not done yet.
    
    The fix is to re-order the socket user data init sequence and add
    write/read barrier calls to be sure that we have proper values
    for the callback pointers before actually calling them.
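    
    The publish-after-init ordering can be sketched in userspace with
    C11 atomics. This is an analogy, not the svcsock.c change: the
    release store and acquire load stand in for the kernel write/read
    barriers, and all ex_* names are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the service data and the socket. */
struct ex_svc {
	void (*saved_data_ready)(void);	/* original callback, saved at setup */
};

struct ex_sock {
	_Atomic(struct ex_svc *) user_data;
};

int ex_calls;

void ex_orig_data_ready(void)
{
	ex_calls++;
}

/* Setup: fill in the saved callback first, then publish the pointer. */
void ex_setup(struct ex_sock *sk, struct ex_svc *svc)
{
	svc->saved_data_ready = ex_orig_data_ready;
	atomic_store_explicit(&sk->user_data, svc, memory_order_release);
}

/* Receive path: acquire the pointer before touching the callback. */
void ex_data_ready(struct ex_sock *sk)
{
	struct ex_svc *svc =
		atomic_load_explicit(&sk->user_data, memory_order_acquire);

	if (svc)
		svc->saved_data_ready();
}
```

    With this ordering, a reader that observes a non-NULL user_data is
    guaranteed to also observe the saved callback pointer.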
    
    Test results: nfs/connectathon reports '0' failed tests for about 200+ iterations.
    
    Crash log:
    ---<-snip->---
    [ 6708.638984] Unable to handle kernel NULL pointer dereference at virtual address 00000000
    [ 6708.647093] pgd = ffff0000094e0000
    [ 6708.650497] [00000000] *pgd=0000010ffff90003, *pud=0000010ffff90003, *pmd=0000010ffff80003, *pte=0000000000000000
    [ 6708.660761] Internal error: Oops: 86000005 [#1] SMP
    [ 6708.665630] Modules linked in: nfsv3 nfnetlink_queue nfnetlink_log nfnetlink rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache overlay xt_CONNSECMARK xt_SECMARK xt_conntrack iptable_security ip_tables ah4 xfrm4_mode_transport sctp tun binfmt_misc ext4 jbd2 mbcache loop tcp_diag udp_diag inet_diag rpcrdma ib_isert iscsi_target_mod ib_iser rdma_cm iw_cm libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib ib_ucm ib_uverbs ib_umad ib_cm ib_core nls_koi8_u nls_cp932 ts_kmp nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack vfat fat ghash_ce sha2_ce sha1_ce cavium_rng_vf i2c_thunderx sg thunderx_edac i2c_smbus edac_core cavium_rng nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c nicvf nicpf ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
    [ 6708.736446]  ttm drm i2c_core thunder_bgx thunder_xcv mdio_thunder mdio_cavium dm_mirror dm_region_hash dm_log dm_mod [last unloaded: stap_3c300909c5b3f46dcacd49aab3334af_87021]
    [ 6708.752275] CPU: 84 PID: 0 Comm: swapper/84 Tainted: G        W  OE   4.11.0-4.el7.aarch64 #1
    [ 6708.760787] Hardware name: www.cavium.com CRB-2S/CRB-2S, BIOS 0.3 Mar 13 2017
    [ 6708.767910] task: ffff810006842e80 task.stack: ffff81000689c000
    [ 6708.773822] PC is at 0x0
    [ 6708.776739] LR is at svc_data_ready+0x38/0x88 [sunrpc]
    [ 6708.781866] pc : [<0000000000000000>] lr : [<ffff0000029d7378>] pstate: 60000145
    [ 6708.789248] sp : ffff810ffbad3900
    [ 6708.792551] x29: ffff810ffbad3900 x28: ffff000008c73d58
    [ 6708.797853] x27: 0000000000000000 x26: ffff81000bbe1e00
    [ 6708.803156] x25: 0000000000000020 x24: ffff800f7410bf28
    [ 6708.808458] x23: ffff000008c63000 x22: ffff000008c63000
    [ 6708.813760] x21: ffff800f7410bf28 x20: ffff81000bbe1e00
    [ 6708.819063] x19: ffff810012412400 x18: 00000000d82a9df2
    [ 6708.824365] x17: 0000000000000000 x16: 0000000000000000
    [ 6708.829667] x15: 0000000000000000 x14: 0000000000000001
    [ 6708.834969] x13: 0000000000000000 x12: 722e736f622e676e
    [ 6708.840271] x11: 00000000f814dd99 x10: 0000000000000000
    [ 6708.845573] x9 : 7374687225000000 x8 : 0000000000000000
    [ 6708.850875] x7 : 0000000000000000 x6 : 0000000000000000
    [ 6708.856177] x5 : 0000000000000028 x4 : 0000000000000000
    [ 6708.861479] x3 : 0000000000000000 x2 : 00000000e5000000
    [ 6708.866781] x1 : 0000000000000000 x0 : ffff81000bbe1e00
    [ 6708.872084]
    [ 6708.873565] Process swapper/84 (pid: 0, stack limit = 0xffff81000689c000)
    [ 6708.880341] Stack: (0xffff810ffbad3900 to 0xffff8100068a0000)
    [ 6708.886075] Call trace:
    [ 6708.888513] Exception stack(0xffff810ffbad3710 to 0xffff810ffbad3840)
    [ 6708.894942] 3700:                                   ffff810012412400 0001000000000000
    [ 6708.902759] 3720: ffff810ffbad3900 0000000000000000 0000000060000145 ffff800f79300000
    [ 6708.910577] 3740: ffff000009274d00 00000000000003ea 0000000000000015 ffff000008c63000
    [ 6708.918395] 3760: ffff810ffbad3830 ffff800f79300000 000000000000004d 0000000000000000
    [ 6708.926212] 3780: ffff810ffbad3890 ffff0000080f88dc ffff800f79300000 000000000000004d
    [ 6708.934030] 37a0: ffff800f7930093c ffff000008c63000 0000000000000000 0000000000000140
    [ 6708.941848] 37c0: ffff000008c2c000 0000000000040b00 ffff81000bbe1e00 0000000000000000
    [ 6708.949665] 37e0: 00000000e5000000 0000000000000000 0000000000000000 0000000000000028
    [ 6708.957483] 3800: 0000000000000000 0000000000000000 0000000000000000 7374687225000000
    [ 6708.965300] 3820: 0000000000000000 00000000f814dd99 722e736f622e676e 0000000000000000
    [ 6708.973117] [<          (null)>]           (null)
    [ 6708.977824] [<ffff0000086f9fa4>] tcp_data_queue+0x754/0xc5c
    [ 6708.983386] [<ffff0000086fa64c>] tcp_rcv_established+0x1a0/0x67c
    [ 6708.989384] [<ffff000008704120>] tcp_v4_do_rcv+0x15c/0x22c
    [ 6708.994858] [<ffff000008707418>] tcp_v4_rcv+0xaf0/0xb58
    [ 6709.000077] [<ffff0000086df784>] ip_local_deliver_finish+0x10c/0x254
    [ 6709.006419] [<ffff0000086dfea4>] ip_local_deliver+0xf0/0xfc
    [ 6709.011980] [<ffff0000086dfad4>] ip_rcv_finish+0x208/0x3a4
    [ 6709.017454] [<ffff0000086e018c>] ip_rcv+0x2dc/0x3c8
    [ 6709.022328] [<ffff000008692fc8>] __netif_receive_skb_core+0x2f8/0xa0c
    [ 6709.028758] [<ffff000008696068>] __netif_receive_skb+0x38/0x84
    [ 6709.034580] [<ffff00000869611c>] netif_receive_skb_internal+0x68/0xdc
    [ 6709.041010] [<ffff000008696bc0>] napi_gro_receive+0xcc/0x1a8
    [ 6709.046690] [<ffff0000014b0fc4>] nicvf_cq_intr_handler+0x59c/0x730 [nicvf]
    [ 6709.053559] [<ffff0000014b1380>] nicvf_poll+0x38/0xb8 [nicvf]
    [ 6709.059295] [<ffff000008697a6c>] net_rx_action+0x2f8/0x464
    [ 6709.064771] [<ffff000008081824>] __do_softirq+0x11c/0x308
    [ 6709.070164] [<ffff0000080d14e4>] irq_exit+0x12c/0x174
    [ 6709.075206] [<ffff00000813101c>] __handle_domain_irq+0x78/0xc4
    [ 6709.081027] [<ffff000008081608>] gic_handle_irq+0x94/0x190
    [ 6709.086501] Exception stack(0xffff81000689fdf0 to 0xffff81000689ff20)
    [ 6709.092929] fde0:                                   0000810ff2ec0000 ffff000008c10000
    [ 6709.100747] fe00: ffff000008c70ef4 0000000000000001 0000000000000000 ffff810ffbad9b18
    [ 6709.108565] fe20: ffff810ffbad9c70 ffff8100169d3800 ffff810006843ab0 ffff81000689fe80
    [ 6709.116382] fe40: 0000000000000bd0 0000ffffdf979cd0 183f5913da192500 0000ffff8a254ce4
    [ 6709.124200] fe60: 0000ffff8a254b78 0000aaab10339808 0000000000000000 0000ffff8a0c2a50
    [ 6709.132018] fe80: 0000ffffdf979b10 ffff000008d6d450 ffff000008c10000 ffff000008d6d000
    [ 6709.139836] fea0: 0000000000000054 ffff000008cd3dbc 0000000000000000 0000000000000000
    [ 6709.147653] fec0: 0000000000000000 0000000000000000 0000000000000000 ffff81000689ff20
    [ 6709.155471] fee0: ffff000008085240 ffff81000689ff20 ffff000008085244 0000000060000145
    [ 6709.163289] ff00: ffff81000689ff10 ffff00000813f1e4 ffffffffffffffff ffff00000813f238
    [ 6709.171107] [<ffff000008082eb4>] el1_irq+0xb4/0x140
    [ 6709.175976] [<ffff000008085244>] arch_cpu_idle+0x44/0x11c
    [ 6709.181368] [<ffff0000087bf3b8>] default_idle_call+0x20/0x30
    [ 6709.187020] [<ffff000008116d50>] do_idle+0x158/0x1e4
    [ 6709.191973] [<ffff000008116ff4>] cpu_startup_entry+0x2c/0x30
    [ 6709.197624] [<ffff00000808e7cc>] secondary_start_kernel+0x13c/0x160
    [ 6709.203878] [<0000000001bc71c4>] 0x1bc71c4
    [ 6709.207967] Code: bad PC value
    [ 6709.211061] SMP: stopping secondary CPUs
    [ 6709.218830] Starting crashdump kernel...
    [ 6709.222749] Bye!
    ---<-snip>---
    
    Signed-off-by: Vadim Lomovtsev <vlomovts@redhat.com>
    Reviewed-by: Jeff Layton <jlayton@redhat.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a8da876c1e45b75c082a5dc8ce10c0761a10c638
Author: Eric Biggers <ebiggers@google.com>
Date:   Thu Aug 24 10:50:29 2017 -0700

    x86/mm: Fix use-after-free of ldt_struct
    
    commit ccd5b3235180eef3cfec337df1c8554ab151b5cc upstream.
    
    The following commit:
    
      39a0526fb3f7 ("x86/mm: Factor out LDT init from context init")
    
    renamed init_new_context() to init_new_context_ldt() and added a new
    init_new_context() which calls init_new_context_ldt().  However, the
    error code of init_new_context_ldt() was ignored.  Consequently, if a
    memory allocation in alloc_ldt_struct() failed during a fork(), the
    ->context.ldt of the new task remained the same as that of the old task
    (due to the memcpy() in dup_mm()).  ldt_struct's are not intended to be
    shared, so a use-after-free occurred after one task exited.
    
    Fix the bug by making init_new_context() pass through the error code of
    init_new_context_ldt().
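    
    The shape of the fix can be sketched in plain C. This is a simplified
    userspace model, not the kernel code: struct mm_model, the fail_alloc
    switch and both model_* helpers are hypothetical stand-ins for the mm
    context, a failing alloc_ldt_struct(), init_new_context_ldt() and
    init_new_context().
    
    ```c
    #include <errno.h>
    #include <stdlib.h>
    
    /* Hypothetical stand-in for the per-mm LDT state. */
    struct mm_model {
        int *ldt;
    };
    
    /*
     * Models init_new_context_ldt(): give the child its own copy of the
     * parent's LDT. On allocation failure child->ldt is left untouched, so
     * it still aliases the parent's LDT and the child must not be used.
     */
    static int model_init_ldt(struct mm_model *child,
                              const struct mm_model *parent, int fail_alloc)
    {
        int *copy;
    
        if (!parent->ldt) {
            child->ldt = NULL;
            return 0;
        }
        copy = fail_alloc ? NULL : malloc(sizeof(*copy));
        if (!copy)
            return -ENOMEM;
        *copy = *parent->ldt;
        child->ldt = copy;
        return 0;
    }
    
    /* Models the fixed init_new_context(): pass the error code through. */
    static int model_init_context(struct mm_model *child,
                                  const struct mm_model *parent, int fail_alloc)
    {
        *child = *parent;   /* like the memcpy() in dup_mm() */
        return model_init_ldt(child, parent, fail_alloc);
    }
    ```
    
    With the error propagated, the caller can abandon the half-initialised
    child instead of letting it run with a pointer it shares with the parent.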
    
    This bug was found by syzkaller, which encountered the following splat:
    
        BUG: KASAN: use-after-free in free_ldt_struct.part.2+0x10a/0x150 arch/x86/kernel/ldt.c:116
        Read of size 4 at addr ffff88006d2cb7c8 by task kworker/u9:0/3710
    
        CPU: 1 PID: 3710 Comm: kworker/u9:0 Not tainted 4.13.0-rc4-next-20170811 #2
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
        Call Trace:
         __dump_stack lib/dump_stack.c:16 [inline]
         dump_stack+0x194/0x257 lib/dump_stack.c:52
         print_address_description+0x73/0x250 mm/kasan/report.c:252
         kasan_report_error mm/kasan/report.c:351 [inline]
         kasan_report+0x24e/0x340 mm/kasan/report.c:409
         __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:429
         free_ldt_struct.part.2+0x10a/0x150 arch/x86/kernel/ldt.c:116
         free_ldt_struct arch/x86/kernel/ldt.c:173 [inline]
         destroy_context_ldt+0x60/0x80 arch/x86/kernel/ldt.c:171
         destroy_context arch/x86/include/asm/mmu_context.h:157 [inline]
         __mmdrop+0xe9/0x530 kernel/fork.c:889
         mmdrop include/linux/sched/mm.h:42 [inline]
         exec_mmap fs/exec.c:1061 [inline]
         flush_old_exec+0x173c/0x1ff0 fs/exec.c:1291
         load_elf_binary+0x81f/0x4ba0 fs/binfmt_elf.c:855
         search_binary_handler+0x142/0x6b0 fs/exec.c:1652
         exec_binprm fs/exec.c:1694 [inline]
         do_execveat_common.isra.33+0x1746/0x22e0 fs/exec.c:1816
         do_execve+0x31/0x40 fs/exec.c:1860
         call_usermodehelper_exec_async+0x457/0x8f0 kernel/umh.c:100
         ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
    
        Allocated by task 3700:
         save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
         save_stack+0x43/0xd0 mm/kasan/kasan.c:447
         set_track mm/kasan/kasan.c:459 [inline]
         kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
         kmem_cache_alloc_trace+0x136/0x750 mm/slab.c:3627
         kmalloc include/linux/slab.h:493 [inline]
         alloc_ldt_struct+0x52/0x140 arch/x86/kernel/ldt.c:67
         write_ldt+0x7b7/0xab0 arch/x86/kernel/ldt.c:277
         sys_modify_ldt+0x1ef/0x240 arch/x86/kernel/ldt.c:307
         entry_SYSCALL_64_fastpath+0x1f/0xbe
    
        Freed by task 3700:
         save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
         save_stack+0x43/0xd0 mm/kasan/kasan.c:447
         set_track mm/kasan/kasan.c:459 [inline]
         kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
         __cache_free mm/slab.c:3503 [inline]
         kfree+0xca/0x250 mm/slab.c:3820
         free_ldt_struct.part.2+0xdd/0x150 arch/x86/kernel/ldt.c:121
         free_ldt_struct arch/x86/kernel/ldt.c:173 [inline]
         destroy_context_ldt+0x60/0x80 arch/x86/kernel/ldt.c:171
         destroy_context arch/x86/include/asm/mmu_context.h:157 [inline]
         __mmdrop+0xe9/0x530 kernel/fork.c:889
         mmdrop include/linux/sched/mm.h:42 [inline]
         __mmput kernel/fork.c:916 [inline]
         mmput+0x541/0x6e0 kernel/fork.c:927
         copy_process.part.36+0x22e1/0x4af0 kernel/fork.c:1931
         copy_process kernel/fork.c:1546 [inline]
         _do_fork+0x1ef/0xfb0 kernel/fork.c:2025
         SYSC_clone kernel/fork.c:2135 [inline]
         SyS_clone+0x37/0x50 kernel/fork.c:2129
         do_syscall_64+0x26c/0x8c0 arch/x86/entry/common.c:287
         return_from_SYSCALL_64+0x0/0x7a
    
    Here is a C reproducer:
    
        #include <asm/ldt.h>
        #include <pthread.h>
        #include <signal.h>
        #include <stdlib.h>
        #include <sys/syscall.h>
        #include <sys/wait.h>
        #include <unistd.h>
    
        static void *fork_thread(void *_arg)
        {
            fork();
            return NULL;
        }
    
        int main(void)
        {
            struct user_desc desc = { .entry_number = 8191 };
    
            syscall(__NR_modify_ldt, 1, &desc, sizeof(desc));
    
            for (;;) {
                if (fork() == 0) {
                    pthread_t t;
    
                    srand(getpid());
                    pthread_create(&t, NULL, fork_thread, NULL);
                    usleep(rand() % 10000);
                    syscall(__NR_exit_group, 0);
                }
                wait(NULL);
            }
        }
    
    Note: the reproducer takes advantage of the fact that alloc_ldt_struct()
    may use vmalloc() to allocate a large ->entries array, and after
    commit:
    
      5d17a73a2ebe ("vmalloc: back off when the current task is killed")
    
    it is possible for userspace to fail a task's vmalloc() by
    sending a fatal signal, e.g. via exit_group().  It would be more
    difficult to reproduce this bug on kernels without that commit.
    
    This bug only affected kernels with CONFIG_MODIFY_LDT_SYSCALL=y.
    
    Signed-off-by: Eric Biggers <ebiggers@google.com>
    Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Andy Lutomirski <luto@amacapital.net>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Brian Gerst <brgerst@gmail.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Denys Vlasenko <dvlasenk@redhat.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: linux-mm@kvack.org
    Fixes: 39a0526fb3f7 ("x86/mm: Factor out LDT init from context init")
    Link: http://lkml.kernel.org/r/20170824175029.76040-1-ebiggers3@gmail.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 2e11eedec6f06c30b60fd2e67f022132483fd418
Author: Nicholas Piggin <npiggin@gmail.com>
Date:   Tue Aug 22 18:43:48 2017 +1000

    timers: Fix excessive granularity of new timers after a nohz idle
    
    commit 2fe59f507a65dbd734b990a11ebc7488f6f87a24 upstream.
    
    When a timer base is idle, it is forwarded when a new timer is added
    to ensure that granularity does not become excessive. When not idle,
    the timer tick is expected to increment the base.
    
    However there are several problems:
    
    - If an existing timer is modified, the base is forwarded only after
      the index is calculated.
    
    - The base is not forwarded by add_timer_on.
    
    - There is a window after a timer is restarted from a nohz idle, after
      it is marked not-idle and before the timer tick on this CPU, where a
      timer may be added but the ancient base does not get forwarded.
    
    These result in excessive granularity (a 1 jiffy timeout can blow out
    to 100s of jiffies), which causes the RCU lockup detector to trigger,
    among other things.
    
    Fix this by keeping track of whether the timer base has been idle
    since it was last run or forwarded, and if so then forward it before
    adding a new timer.
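    
    As a rough illustration of that idea, here is a simplified userspace
    model. It is not the kernel implementation: struct base_model, its
    was_idle flag and both helpers are hypothetical names, and the value
    returned by model_add_timer() merely stands for the distance that would
    be used to pick a wheel level.
    
    ```c
    #include <stdbool.h>
    
    /* Hypothetical, simplified model of a timer wheel base. */
    struct base_model {
        unsigned long clk;   /* wheel clock, in jiffies */
        bool was_idle;       /* idle since it was last run or forwarded? */
    };
    
    /* Bring the wheel clock up to date, but only if the base has been idle. */
    static void model_forward_base(struct base_model *base, unsigned long now)
    {
        if (!base->was_idle)
            return;
        if ((long)(now - base->clk) > 0)
            base->clk = now;
        base->was_idle = false;
    }
    
    /*
     * Forward first, then compute the distance from the wheel clock to the
     * expiry. A stale clock inflates this distance, which is what pushes a
     * short timeout into a coarse wheel level.
     */
    static unsigned long model_add_timer(struct base_model *base,
                                         unsigned long now,
                                         unsigned long expires)
    {
        model_forward_base(base, now);
        return expires - base->clk;
    }
    ```
    
    In this model a base left at clk = 1000 that receives a 1 jiffy timer at
    jiffy 1500 sees a distance of 1 rather than 501.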
    
    There is still a case where mod_timer optimises the case of a pending
    timer mod with the same expiry time, where the timer can see excessive
    granularity relative to the new, shorter interval. A comment is added,
    but it's not changed because it is an important fastpath for
    networking.
    
    This has been tested and found to fix the RCU softlockup messages.
    
    Testing was also done with tracing to measure requested versus
    achieved wakeup latencies for all non-deferrable timers in an idle
    system (with no lockup watchdogs running). Wakeup latency relative to
    absolute latency is calculated (note this suffers from round-up skew
    at low absolute times) and analysed:
    
                 max     avg      std
    upstream   506.0    1.20     4.68
    patched      2.0    1.08     0.15
    
    The bug was noticed due to the lockup detector Kconfig changes
    dropping it out of people's .configs and resulting in larger base
    clk skew. When the lockup detectors are enabled, no CPU can go idle for
    longer than 4 seconds, which limits the granularity errors.
    Sub-optimal timer behaviour is observable on a smaller scale in that
    case:
    
                 max     avg      std
    upstream     9.0    1.05     0.19
    patched      2.0    1.04     0.11
    
    Fixes: a683f390b93f ("timers: Forward the wheel clock whenever possible")
    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Tested-by: David Miller <davem@davemloft.net>
    Cc: dzickus@redhat.com
    Cc: sfr@canb.auug.org.au
    Cc: mpe@ellerman.id.au
    Cc: Stephen Boyd <sboyd@codeaurora.org>
    Cc: linuxarm@huawei.com
    Cc: abdhalee@linux.vnet.ibm.com
    Cc: John Stultz <john.stultz@linaro.org>
    Cc: akpm@linux-foundation.org
    Cc: paulmck@linux.vnet.ibm.com
    Cc: torvalds@linux-foundation.org
    Link: http://lkml.kernel.org/r/20170822084348.21436-1-npiggin@gmail.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 2c0dc7f00e1994656c6e64bd0890bdc1e34e04cf
Author: Mark Rutland <mark.rutland@arm.com>
Date:   Thu Jun 22 15:41:38 2017 +0100

    perf/core: Fix group {cpu,task} validation
    
    commit 64aee2a965cf2954a038b5522f11d2cd2f0f8f3e upstream.
    
    Regardless of which events form a group, it does not make sense for the
    events to target different tasks and/or CPUs, as this leaves the group
    inconsistent and impossible to schedule. The core perf code assumes that
    these are consistent across (successfully initialised) groups.
    
    Core perf code only verifies this when moving SW events into a HW
    context. Thus, we can violate this requirement for pure SW groups and
    pure HW groups, unless the relevant PMU driver happens to perform this
    verification itself. These mismatched groups subsequently wreak havoc
    elsewhere.
    
    For example, we handle watchpoints as SW events, and reserve watchpoint
    HW on a per-CPU basis at pmu::event_init() time to ensure that any event
    that is initialised is guaranteed to have a slot at pmu::add() time.
    However, the core code only checks the group leader's cpu filter (via
    event_filter_match()), and can thus install follower events onto CPUs
    violating their (mismatched) CPU filters, potentially installing them
    into a CPU without sufficient reserved slots.
    
    This can be triggered with the below test case, resulting in warnings
    from arch backends.
    
      #define _GNU_SOURCE
      #include <linux/hw_breakpoint.h>
      #include <linux/perf_event.h>
      #include <sched.h>
      #include <stdio.h>
      #include <sys/prctl.h>
      #include <sys/syscall.h>
      #include <unistd.h>
    
      static int perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
                               int group_fd, unsigned long flags)
      {
            return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
      }
    
      char watched_char;
    
      struct perf_event_attr wp_attr = {
            .type = PERF_TYPE_BREAKPOINT,
            .bp_type = HW_BREAKPOINT_RW,
            .bp_addr = (unsigned long)&watched_char,
            .bp_len = 1,
            .size = sizeof(wp_attr),
      };
    
      int main(int argc, char *argv[])
      {
            int leader, ret;
            cpu_set_t cpus;
    
            /*
             * Force use of CPU0 to ensure our CPU0-bound events get scheduled.
             */
            CPU_ZERO(&cpus);
            CPU_SET(0, &cpus);
            ret = sched_setaffinity(0, sizeof(cpus), &cpus);
            if (ret) {
                    printf("Unable to set cpu affinity\n");
                    return 1;
            }
    
            /* open leader event, bound to this task, CPU0 only */
            leader = perf_event_open(&wp_attr, 0, 0, -1, 0);
            if (leader < 0) {
                    printf("Couldn't open leader: %d\n", leader);
                    return 1;
            }
    
            /*
             * Open a follower event that is bound to the same task, but a
             * different CPU. This means that the group should never be possible to
             * schedule.
             */
            ret = perf_event_open(&wp_attr, 0, 1, leader, 0);
            if (ret < 0) {
                    printf("Couldn't open mismatched follower: %d\n", ret);
                    return 1;
            } else {
                    printf("Opened leader/follower with mismatched CPUs\n");
            }
    
            /*
             * Open as many independent events as we can, all bound to the same
             * task, CPU0 only.
             */
            do {
                    ret = perf_event_open(&wp_attr, 0, 0, -1, 0);
            } while (ret >= 0);
    
            /*
             * Force enable/disable all events to trigger the erroneous
             * installation of the follower event.
             */
            printf("Opened all events. Toggling..\n");
            for (;;) {
                    prctl(PR_TASK_PERF_EVENTS_DISABLE, 0, 0, 0, 0);
                    prctl(PR_TASK_PERF_EVENTS_ENABLE, 0, 0, 0, 0);
            }
    
            return 0;
      }
    
    Fix this by validating this requirement regardless of whether we're
    moving events.
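    
    The check itself is small. The following is a hedged userspace sketch
    of the requirement, not the code in the perf core: struct event_model
    and model_validate_group() are hypothetical, and the -EINVAL return
    value is an assumption made for this illustration.
    
    ```c
    #include <errno.h>
    #include <stddef.h>
    
    /* Hypothetical, reduced view of a perf event. */
    struct event_model {
        int cpu;            /* -1 means "any CPU" */
        const void *task;   /* NULL for a per-CPU event */
    };
    
    /* Reject a follower whose CPU or task differs from its group leader. */
    static int model_validate_group(const struct event_model *leader,
                                    const struct event_model *event)
    {
        if (!leader)
            return 0;           /* not joining a group, nothing to compare */
        if (leader->cpu != event->cpu)
            return -EINVAL;     /* could never be scheduled together */
        if (leader->task != event->task)
            return -EINVAL;     /* must target the same task, or both none */
        return 0;
    }
    ```
    
    Applied to the test case above, the follower opened on CPU1 against a
    CPU0 leader would be refused when it is opened, before it could ever be
    installed.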
    
    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Zhou Chengming <zhouchengming1@huawei.com>
    Link: http://lkml.kernel.org/r/1498142498-15758-1-git-send-email-mark.rutland@arm.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit aa2da6c4d548b40f4d6154afd766070bed27ebdb
Author: Steven Rostedt (VMware) <rostedt@goodmis.org>
Date:   Thu Aug 17 16:37:25 2017 -0400

    ftrace: Check for null ret_stack on profile function graph entry function
    
    commit a8f0f9e49956a74718874b800251455680085600 upstream.
    
    There's a small race between function graph shutting down and the calling
    of the registered function graph entry callback. The callback must not
    reference the task's ret_stack without first checking that it is not NULL.
    Note, when a ret_stack is allocated for a task, it stays allocated until the
    task exits. The problem here is that function_graph is shut down, and a new
    task is created, which doesn't have its ret_stack allocated. But since some
    of the functions are still being traced, the callbacks can still be called.
    
    The normal function_graph code handles this, but starting with commit
    8861dd303c ("ftrace: Access ret_stack->subtime only in the function
    profiler") the profiler code references the ret_stack on function entry, but
    doesn't check if it is NULL first.
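    
    A minimal sketch of the missing check, written as a userspace model
    rather than the actual profiler callback. struct task_model, struct
    ret_stack_model, MODEL_RETFUNC_DEPTH and model_profile_graph_entry() are
    hypothetical names chosen for this illustration.
    
    ```c
    #include <stddef.h>
    
    #define MODEL_RETFUNC_DEPTH 50   /* assumed depth limit for the model */
    
    struct ret_stack_model {
        unsigned long long subtime;
    };
    
    struct task_model {
        struct ret_stack_model *ret_stack;   /* NULL if never allocated */
    };
    
    /* Entry callback: returns 1 to trace the function, 0 to skip it. */
    static int model_profile_graph_entry(struct task_model *task, int depth)
    {
        /* A task created after shutdown may have no ret_stack at all. */
        if (!task->ret_stack)
            return 0;
    
        if (depth >= 0 && depth < MODEL_RETFUNC_DEPTH)
            task->ret_stack[depth].subtime = 0;
    
        return 1;
    }
    ```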
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=196611
    
    Fixes: 8861dd303c ("ftrace: Access ret_stack->subtime only in the function profiler")
    Reported-by: lilydjwg@gmail.com
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 1b8ca8851c252b6d7fc2d7b7c880f6ceb1bec27e
Author: Christoph Hellwig <hch@lst.de>
Date:   Thu Aug 24 18:07:02 2017 +0200

    virtio_pci: fix cpu affinity support
    
    commit ba74b6f7fcc07355d087af6939712eed4a454821 upstream.
    
    Commit 0b0f9dc5 ("Revert "virtio_pci: use shared interrupts for
    virtqueues"") removed the adjustment of the pre_vectors for the virtio
    MSI-X vector allocation which was added in commit fb5e31d9 ("virtio:
    allow drivers to request IRQ affinity when creating VQs"). This will
    lead to an incorrect assignment of MSI-X vectors, and potential
    deadlocks when offlining cpus.
    
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Fixes: 0b0f9dc5 ("Revert "virtio_pci: use shared interrupts for virtqueues"")
    Reported-by: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 78f2e29f27f14b8d5787c8cb2942186d9c8acab8
Author: Steven Rostedt (VMware) <rostedt@goodmis.org>
Date:   Wed Aug 2 14:20:54 2017 -0400

    ring-buffer: Have ring_buffer_alloc_read_page() return error on offline CPU
    
    commit a7e52ad7ed82e21273eccff93d1477a7b313aabb upstream.
    
    Chunyu Hu reported:
      "per_cpu trace directories and files are created for all possible cpus,
       but only the cpus which have ever been on-lined have their own per cpu
       ring buffer (allocated by cpuhp threads). While trace_buffers_open, the
       open handler for trace file 'trace_pipe_raw' is always trying to access
       field of ring_buffer_per_cpu, and would panic with the NULL pointer.
    
       Align the behavior of trace_pipe_raw with trace_pipe, that returns -ENODEV
       when opening it if that cpu does not have trace ring buffer.
    
       Reproduce:
       cat /sys/kernel/debug/tracing/per_cpu/cpu31/trace_pipe_raw
       (cpu31 is never on-lined, this is a 16 cores x86_64 box)
    
       Tested with:
       1) boot with maxcpus=14, read trace_pipe_raw of cpu15.
          Got -ENODEV.
       2) online cpu15, read trace_pipe_raw of cpu15.
          Get the raw trace data.
    
       Call trace:
       [ 5760.950995] RIP: 0010:ring_buffer_alloc_read_page+0x32/0xe0
       [ 5760.961678]  tracing_buffers_read+0x1f6/0x230
       [ 5760.962695]  __vfs_read+0x37/0x160
       [ 5760.963498]  ? __vfs_read+0x5/0x160
       [ 5760.964339]  ? security_file_permission+0x9d/0xc0
       [ 5760.965451]  ? __vfs_read+0x5/0x160
       [ 5760.966280]  vfs_read+0x8c/0x130
       [ 5760.967070]  SyS_read+0x55/0xc0
       [ 5760.967779]  do_syscall_64+0x67/0x150
       [ 5760.968687]  entry_SYSCALL64_slow_path+0x25/0x25"
    
    This was introduced by the addition of the feature to reuse reader pages
    instead of re-allocating them. The problem is that the allocation of a
    reader page (which is per cpu) does not check if the cpu is online and set
    up for the ring buffer.
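    
    The missing guard can be illustrated with a small userspace model. It
    is only a sketch under stated assumptions: struct rb_model, the bitmask
    representation of CPUs that have a buffer, MODEL_NR_CPUS and
    model_alloc_read_page() are hypothetical, and the int return value is a
    simplification chosen for the example.
    
    ```c
    #include <errno.h>
    #include <stdlib.h>
    
    #define MODEL_NR_CPUS   4
    #define MODEL_PAGE_SIZE 4096
    
    /* Hypothetical ring buffer: bit n set means CPU n has a per-CPU buffer. */
    struct rb_model {
        unsigned int cpu_mask;
    };
    
    /* Allocate a reader page for @cpu, refusing CPUs that have no buffer. */
    static int model_alloc_read_page(const struct rb_model *rb, int cpu,
                                     void **page)
    {
        *page = NULL;
    
        if (cpu < 0 || cpu >= MODEL_NR_CPUS || !(rb->cpu_mask & (1u << cpu)))
            return -ENODEV;
    
        *page = calloc(1, MODEL_PAGE_SIZE);
        return *page ? 0 : -ENOMEM;
    }
    ```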
    
    Link: http://lkml.kernel.org/r/1500880866-1177-1-git-send-email-chuhu@redhat.com
    
    Fixes: 73a757e63114 ("ring-buffer: Return reader page back into existing ring buffer")
    Reported-by: Chunyu Hu <chuhu@redhat.com>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 8d4f126c0791c4fa38a294cc4c2fa6ca0b21b562
Author: Chuck Lever <chuck.lever@oracle.com>
Date:   Fri Aug 18 11:12:19 2017 -0400

    nfsd: Limit end of page list when decoding NFSv4 WRITE
    
    commit fc788f64f1f3eb31e87d4f53bcf1ab76590d5838 upstream.
    
    When processing an NFSv4 WRITE operation, argp->end should never
    point past the end of the data in the final page of the page list.
    Otherwise, nfsd4_decode_compound can walk into uninitialized memory.
    
    More critically, nfsd4_decode_write is failing to increment argp->pagelen
    when it increments argp->pagelist.  This can cause later xdr decoders
    to assume more data is available than really is, which can cause server
    crashes on malformed requests.
    
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ea5745a5117fd1e1cbd4497ea51367b10e3881f0
Author: Ronnie Sahlberg <lsahlber@redhat.com>
Date:   Wed Aug 23 14:48:14 2017 +1000

    cifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup()
    
    commit d3edede29f74d335f81d95a4588f5f136a9f7dcf upstream.
    
    Add checking for the path component length and verify it is <= the maximum
    that the server advertizes via FileFsAttributeInformation.
    
    With this patch cifs.ko will now return ENAMETOOLONG instead of ENOENT
    when users try to access an overlong path.
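    
    A per-component length check of this kind might look like the sketch
    below. It is a userspace illustration, not the cifs.ko code:
    model_check_name() is a hypothetical helper, '/' is assumed as the
    separator, and max_len stands for the limit the server advertises.
    
    ```c
    #include <errno.h>
    #include <string.h>
    
    /* Check every '/'-separated component of @path against @max_len. */
    static int model_check_name(const char *path, size_t max_len)
    {
        while (*path) {
            /* Length of the component that starts at path. */
            size_t len = strcspn(path, "/");
    
            if (len > max_len)
                return -ENAMETOOLONG;
    
            path += len;
            while (*path == '/')    /* skip one or more separators */
                path++;
        }
        return 0;
    }
    ```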
    
    To test this, try to cd into a (non-existing) directory on a CIFS share
    that has a too long name:
    cd /mnt/aaaaaaaaaaaaaaa...
    
    and it now should show a good error message from the shell:
    bash: cd: /mnt/aaaaaaaaaaaaaaaa...aaaaaa: File name too long
    
    rh bz 1153996
    
    Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
    Signed-off-by: Steve French <smfrench@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 1bc1c4391b7909208ff1184f45fd756fe7a59403
Author: Sachin Prabhu <sprabhu@redhat.com>
Date:   Thu Aug 3 13:09:03 2017 +0530

    cifs: Fix df output for users with quota limits
    
    commit 42bec214d8bd432be6d32a1acb0a9079ecd4d142 upstream.
    
    The df for a SMB2 share triggers a GetInfo call for
    FS_FULL_SIZE_INFORMATION. The values returned are used to populate
    struct statfs.
    
    The problem is that none of the information returned by the call
    contains the total blocks available on the filesystem. Instead we use
    the blocks available to the user, i.e. the quota limitation, when
    filling out statfs.f_blocks. The information returned does contain the
    actual free units on the filesystem, which are used to populate
    statfs.f_bfree. For users with quota enabled, this can lead to
    situations where the total free space reported is more than the total
    blocks on the system, ending up with df reports like the following:
    
     # df -h /mnt/a
    Filesystem         Size  Used Avail Use% Mounted on
    //192.168.22.10/a  2.5G -2.3G  2.5G    - /mnt/a
    
    To fix this problem, we instead populate statfs.f_bfree with the same
    value as statfs.f_bavail, i.e. CallerAvailableAllocationUnits. This is
    similar to what is already done in the code for cifs, and df now
    reports the quota information for the user used to mount the share.
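    
    The mapping can be sketched as follows. This is an illustrative
    userspace model only: struct full_size_info_model, struct statfs_model
    and model_fill_statfs() are hypothetical, with field names loosely
    following the units named above.
    
    ```c
    #include <stdint.h>
    
    /* Hypothetical subset of the FS_FULL_SIZE_INFORMATION reply. */
    struct full_size_info_model {
        uint64_t total_allocation_units;            /* capped by user quota */
        uint64_t caller_available_allocation_units; /* free for this user */
        uint64_t actual_available_allocation_units; /* free on the volume */
    };
    
    struct statfs_model {
        uint64_t f_blocks;
        uint64_t f_bfree;
        uint64_t f_bavail;
    };
    
    /*
     * Fill statfs from the quota-limited values only, so that f_bfree can
     * no longer exceed f_blocks when the volume has more free space than
     * the user's quota allows.
     */
    static void model_fill_statfs(const struct full_size_info_model *info,
                                  struct statfs_model *st)
    {
        st->f_blocks = info->total_allocation_units;
        st->f_bfree = info->caller_available_allocation_units;
        st->f_bavail = info->caller_available_allocation_units;
    }
    ```
    
    Since df derives the used space from f_blocks minus f_bfree, keeping
    both values on the same quota-limited scale avoids the negative "Used"
    column shown earlier.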
    
     # df --si /mnt/a
    Filesystem         Size  Used Avail Use% Mounted on
    //192.168.22.10/a  2.7G  101M  2.6G   4% /mnt/a
    
    Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
    Signed-off-by: Pierguido Lambri <plambri@redhat.com>
    Signed-off-by: Steve French <smfrench@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3b278d7e8945ae1e44efc51307692018b7384654
Author: Nicholas Piggin <npiggin@gmail.com>
Date:   Wed Jul 26 22:46:27 2017 +1000

    kbuild: linker script do not match C names unless LD_DEAD_CODE_DATA_ELIMINATION is configured
    
    commit cb87481ee89dbd6609e227afbf64900fb4e5c930 upstream.
    
    The .data and .bss sections were modified in the generic linker script to
    pull in sections named .data.<C identifier>, which are generated by gcc with
    -ffunction-sections and -fdata-sections options.
    
    The problem with this pattern is it can also match section names that Linux
    defines explicitly, e.g., .data.unlikely. This can cause Linux sections to
    get moved into the wrong place.
    
    The way to avoid this is to use ".." separators for explicit section names
    (the dot character is valid in a section name but not a C identifier).
    However currently there are sections which don't follow this rule, so for
    now just disable the wild card by default.
    
    Example: http://marc.info/?l=linux-arm-kernel&m=150106824024221&w=2
    
    Fixes: b67067f1176df ("kbuild: allow archs to select link dead code/data elimination")
    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 51f49383a9249e5871241254f64228376a3e0af2
Author: Bharat Potnuri <bharat@chelsio.com>
Date:   Tue Aug 1 10:58:35 2017 +0530

    RDMA/uverbs: Initialize cq_context appropriately
    
    commit 65159c051c45f269cf40a14f9404248f2d524920 upstream.
    
    Initializing cq_context with ev_queue in create_cq() leads to a NULL pointer
    dereference in ib_uverbs_comp_handler() if the application does not use a
    completion channel. This patch fixes the cq_context initialization.
    
    Fixes: 1e7710f3f65 ("IB/core: Change completion channel to use the reworked")
    Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
    Reviewed-by: Matan Barak <matanb@mellanox.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
    (cherry picked from commit 699a2d5b1b880b4e4e1c7d55fa25659322cf5b51)
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 53a38dfbb5e48ebc082613a16bd2371ad85a5842
Author: Steven Rostedt (VMware) <rostedt@goodmis.org>
Date:   Wed Aug 23 12:46:27 2017 -0400

    tracing: Fix freeing of filter in create_filter() when set_str is false
    
    commit 8b0db1a5bdfcee0dbfa89607672598ae203c9045 upstream.
    
    Performing the following task with kmemleak enabled:
    
     # cd /sys/kernel/tracing/events/irq/irq_handler_entry/
     # echo 'enable_event:kmem:kmalloc:3 if irq >' > trigger
     # echo 'enable_event:kmem:kmalloc:3 if irq > 31' > trigger
     # echo scan > /sys/kernel/debug/kmemleak
     # cat /sys/kernel/debug/kmemleak
    unreferenced object 0xffff8800b9290308 (size 32):
      comm "bash", pid 1114, jiffies 4294848451 (age 141.139s)
      hex dump (first 32 bytes):
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      backtrace:
        [<ffffffff81cef5aa>] kmemleak_alloc+0x4a/0xa0
        [<ffffffff81357938>] kmem_cache_alloc_trace+0x158/0x290
        [<ffffffff81261c09>] create_filter_start.constprop.28+0x99/0x940
        [<ffffffff812639c9>] create_filter+0xa9/0x160
        [<ffffffff81263bdc>] create_event_filter+0xc/0x10
        [<ffffffff812655e5>] set_trigger_filter+0xe5/0x210
        [<ffffffff812660c4>] event_enable_trigger_func+0x324/0x490
        [<ffffffff812652e2>] event_trigger_write+0x1a2/0x260
        [<ffffffff8138cf87>] __vfs_write+0xd7/0x380
        [<ffffffff8138f421>] vfs_write+0x101/0x260
        [<ffffffff8139187b>] SyS_write+0xab/0x130
        [<ffffffff81cfd501>] entry_SYSCALL_64_fastpath+0x1f/0xbe
        [<ffffffffffffffff>] 0xffffffffffffffff
    
    The function create_filter() is passed a 'filterp' pointer that gets
    allocated, and if "set_str" is true, it is up to the caller to free it, even
    on error. The problem is that the pointer is not freed by create_filter()
    when set_str is false. This is a bug, and it is not up to the caller to free
    the filter on error if it doesn't care about the string.
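    
    The ownership rule can be modelled in a few lines of userspace C. This
    is a sketch, not the tracing code: struct filter_model, the parse_ok
    switch that simulates a parse result, and model_create_filter() are
    hypothetical.
    
    ```c
    #include <errno.h>
    #include <stdbool.h>
    #include <stdlib.h>
    
    struct filter_model {
        bool parsed;
    };
    
    /*
     * On error the filter is handed back only when set_str is true, because
     * only then does the caller want to read the error string from it.
     * Otherwise it is freed here and *filterp is set to NULL.
     */
    static int model_create_filter(bool parse_ok, bool set_str,
                                   struct filter_model **filterp)
    {
        struct filter_model *filter;
        int err = parse_ok ? 0 : -EINVAL;
    
        *filterp = NULL;
        filter = calloc(1, sizeof(*filter));
        if (!filter)
            return -ENOMEM;
        filter->parsed = parse_ok;
    
        if (err && !set_str) {
            free(filter);
            filter = NULL;
        }
    
        *filterp = filter;
        return err;
    }
    ```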
    
    Link: http://lkml.kernel.org/r/1502705898-27571-2-git-send-email-chuhu@redhat.com
    
    Fixes: 38b78eb85 ("tracing: Factorize filter creation")
    Reported-by: Chunyu Hu <chuhu@redhat.com>
    Tested-by: Chunyu Hu <chuhu@redhat.com>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 983ba8148e187145877e3d91a8520440f409158a
Author: Chunyu Hu <chuhu@redhat.com>
Date:   Mon Aug 14 18:18:17 2017 +0800

    tracing: Fix kmemleak in tracing_map_array_free()
    
    commit 475bb3c69ab05df2a6ecef6acc2393703d134180 upstream.
    
    kmemleak reported the below leak when I was clearing the hist
    trigger. With this patch, the kmemleak report is gone.
    
    unreferenced object 0xffff94322b63d760 (size 32):
      comm "bash", pid 1522, jiffies 4403687962 (age 2442.311s)
      hex dump (first 32 bytes):
        00 01 00 00 04 00 00 00 08 00 00 00 ff 00 00 00  ................
        10 00 00 00 00 00 00 00 80 a8 7a f2 31 94 ff ff  ..........z.1...
      backtrace:
        [<ffffffff9e96c27a>] kmemleak_alloc+0x4a/0xa0
        [<ffffffff9e424cba>] kmem_cache_alloc_trace+0xca/0x1d0
        [<ffffffff9e377736>] tracing_map_array_alloc+0x26/0x140
        [<ffffffff9e261be0>] kretprobe_trampoline+0x0/0x50
        [<ffffffff9e38b935>] create_hist_data+0x535/0x750
        [<ffffffff9e38bd47>] event_hist_trigger_func+0x1f7/0x420
        [<ffffffff9e38893d>] event_trigger_write+0xfd/0x1a0
        [<ffffffff9e44dfc7>] __vfs_write+0x37/0x170
        [<ffffffff9e44f552>] vfs_write+0xb2/0x1b0
        [<ffffffff9e450b85>] SyS_write+0x55/0xc0
        [<ffffffff9e203857>] do_syscall_64+0x67/0x150
        [<ffffffff9e977ce7>] return_from_SYSCALL_64+0x0/0x6a
        [<ffffffffffffffff>] 0xffffffffffffffff
    unreferenced object 0xffff9431f27aa880 (size 128):
      comm "bash", pid 1522, jiffies 4403687962 (age 2442.311s)
      hex dump (first 32 bytes):
        00 00 8c 2a 32 94 ff ff 00 f0 8b 2a 32 94 ff ff  ...*2......*2...
        00 e0 8b 2a 32 94 ff ff 00 d0 8b 2a 32 94 ff ff  ...*2......*2...
      backtrace:
        [<ffffffff9e96c27a>] kmemleak_alloc+0x4a/0xa0
        [<ffffffff9e425348>] __kmalloc+0xe8/0x220
        [<ffffffff9e3777c1>] tracing_map_array_alloc+0xb1/0x140
        [<ffffffff9e261be0>] kretprobe_trampoline+0x0/0x50
        [<ffffffff9e38b935>] create_hist_data+0x535/0x750
        [<ffffffff9e38bd47>] event_hist_trigger_func+0x1f7/0x420
        [<ffffffff9e38893d>] event_trigger_write+0xfd/0x1a0
        [<ffffffff9e44dfc7>] __vfs_write+0x37/0x170
        [<ffffffff9e44f552>] vfs_write+0xb2/0x1b0
        [<ffffffff9e450b85>] SyS_write+0x55/0xc0
        [<ffffffff9e203857>] do_syscall_64+0x67/0x150
        [<ffffffff9e977ce7>] return_from_SYSCALL_64+0x0/0x6a
        [<ffffffffffffffff>] 0xffffffffffffffff
    
    Link: http://lkml.kernel.org/r/1502705898-27571-1-git-send-email-chuhu@redhat.com
    
    Fixes: 08d43a5fa063 ("tracing: Add lock-free tracing_map")
    Signed-off-by: Chunyu Hu <chuhu@redhat.com>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a23e782823d6848c193b734abf98fe0da6219217
Author: Dan Carpenter <dan.carpenter@oracle.com>
Date:   Tue Aug 1 14:02:01 2017 +0300

    tracing: Missing error code in tracer_alloc_buffers()
    
    commit 147d88e0b5eb90191bc5c12ca0a3c410b75a13d2 upstream.
    
    If ring_buffer_alloc() or one of the next couple of function calls fails
    then we should return -ENOMEM but the current code returns success.
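    
    This class of bug is easy to reproduce in a small model of the goto
    cleanup pattern. The sketch below is illustrative only:
    model_earlier_step(), model_alloc_buffers() and the step_fails switch
    are hypothetical, and the buffers are freed on success purely to keep
    the example leak-free.
    
    ```c
    #include <errno.h>
    #include <stdlib.h>
    
    /* Stand-in for an earlier setup step that succeeds and returns 0. */
    static int model_earlier_step(void)
    {
        return 0;
    }
    
    /* step_fails selects which allocation (1 or 2) fails; 0 means none. */
    static int model_alloc_buffers(int step_fails)
    {
        void *first = NULL, *second = NULL;
        int ret;
    
        ret = model_earlier_step();
        if (ret < 0)
            goto out;
    
        /*
         * ret is 0 at this point. Without resetting it, the failure paths
         * below would jump to the cleanup labels and report success.
         */
        ret = -ENOMEM;
    
        first = step_fails == 1 ? NULL : malloc(16);
        if (!first)
            goto out;
    
        second = step_fails == 2 ? NULL : malloc(16);
        if (!second)
            goto out_free_first;
    
        ret = 0;
        free(second);
    out_free_first:
        free(first);
    out:
        return ret;
    }
    ```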
    
    Link: http://lkml.kernel.org/r/20170801110201.ajdkct7vwzixahvx@mwanda
    
    Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Cc: Ingo Molnar <mingo@redhat.com>
    Fixes: b32614c03413 ('tracing/rb: Convert to hotplug state machine')
    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3888c3aeb6bce2c50d2b84f254ac0bcf8f96fdb7
Author: Steven Rostedt (VMware) <rostedt@goodmis.org>
Date:   Tue Aug 1 12:01:52 2017 -0400

    tracing: Call clear_boot_tracer() at lateinit_sync
    
    commit 4bb0f0e73c8c30917d169c4a0f1ac083690c545b upstream.
    
    The clear_boot_tracer() function resets the default_bootup_tracer string
    to prevent it from being accessed after boot, as it originally points to
    init data. But since clear_boot_tracer() is registered with
    late_initcall(), it races with the initcall that registers the hwlat
    tracer. If someone adds "ftrace=hwlat" to the kernel command line, then
    depending on how the linker sets up the text, the saved command line may
    be cleared first, and the hwlat tracer is never initialized.
    
    Simply have clear_boot_tracer() be called via late_initcall_sync(), as
    that is meant for tasks that must run after the late initcalls.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=196551
    
    Fixes: e7c15cd8a ("tracing: Added hardware latency tracer")
    Reported-by: Zamir SUN <sztsian@gmail.com>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 1344db83ee17ca11be551ed074a64a28d8e17f1e
Author: Sakari Ailus <sakari.ailus@linux.intel.com>
Date:   Tue Aug 22 23:39:58 2017 +0300

    ACPI: device property: Fix node lookup in acpi_graph_get_child_prop_value()
    
    commit b5212f57da145e53df790a7e211d94daac768bf8 upstream.
    
    acpi_graph_get_child_prop_value() is intended to find a child node with a
    certain property value pair. The check
    
            if (!fwnode_property_read_u32(fwnode, prop_name, &nr))
                    continue;
    
    is faulty: fwnode_property_read_u32() returns zero on success, not on
    failure, leading to comparing values only if the searched property was not
    found.
    
    Moreover, the check is made against the parent device node instead of
    the child one as it should be.
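    
    A minimal userspace sketch of the corrected lookup logic, assuming a
    reader that, like fwnode_property_read_u32(), returns zero on success.
    struct node, read_u32() and find_child() are hypothetical stand-ins, not
    the ACPI implementation.
    
```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node carrying at most one named u32 property. */
struct node {
    const char *prop_name;
    unsigned int prop_val;
};

/* Stand-in reader: 0 on success, negative errno on failure. */
static int read_u32(const struct node *n, const char *name, unsigned int *out)
{
    if (!n->prop_name || strcmp(n->prop_name, name))
        return -EINVAL;
    *out = n->prop_val;
    return 0;
}

/* Return the first child whose property 'name' equals 'val', or NULL. */
static const struct node *find_child(const struct node *children, size_t count,
                                     const char *name, unsigned int val)
{
    size_t i;

    for (i = 0; i < count; i++) {
        unsigned int nr;

        /* Skip on failure (non-zero), and read from the child itself. */
        if (read_u32(&children[i], name, &nr))
            continue;
        if (nr == val)
            return &children[i];
    }
    return NULL;
}
```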
    
    Fixes: 79389a83bc38 (ACPI / property: Add support for remote endpoints)
    Reported-by: Hyungwoo Yang <hyungwoo.yang@intel.com>
    Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
    [ rjw: Changelog ]
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit dbe5b2d70cfdc3e1df1ceb3f715c6ef7d17fc566
Author: Alex Deucher <alexdeucher@gmail.com>
Date:   Thu Aug 17 16:36:51 2017 -0400

    Revert "drm/amdgpu: fix vblank_time when displays are off"
    
    This reverts commit 2dc1889ebf8501b0edf125e89a30e1cf3744a2a7.
    
    Fixes a suspend and resume regression.
    
    bug: https://bugzilla.kernel.org/show_bug.cgi?id=196615
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 4ac9a5daaf821781de8a16ab920132669f78195a
Author: fred gao <fred.gao@intel.com>
Date:   Wed Aug 16 15:48:03 2017 +0800

    drm/i915/gvt: Fix the kernel null pointer error
    
    commit ffeaf9aaf97b4bdaf114d6df52f800d71918768c upstream.
    
    Once an error happens in the shadow_indirect_ctx() function, the
    variable wa_ctx->indirect_ctx.obj is accessed without having been
    initialized, so a kernel NULL pointer panic occurs.
    
    Fixes: 894cf7d15634 ("drm/i915/gvt: i915_gem_object_create() returns an error pointer")
    Signed-off-by: fred gao <fred.gao@intel.com>
    Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit bbb04b377f851803ced24a82e873518ea78683c4
Author: Jani Nikula <jani.nikula@intel.com>
Date:   Fri Aug 11 14:39:07 2017 +0300

    drm/i915/vbt: ignore extraneous child devices for a port
    
    commit 7c648bde211baeda7a029bd6be4957e8be48d8c9 upstream.
    
    Ever since we've parsed VBT child devices, starting from 6acab15a7b0d
    ("drm/i915: use the HDMI DDI buffer translations from VBT"), we've
    ignored the child device information if more than one child device
    references the same port. The rationale for this seems lost in time.
    
    Since commit 311a20949f04 ("drm/i915: don't init DP or HDMI when not
    supported by DDI port") we started using this information more to skip
    HDMI/DP init if the port wasn't there per VBT child devices. However, at
    the same time it added port defaults without further explanation.
    
    Thus, if the child device info was skipped due to multiple child devices
    referencing the same port, the device info would be retrieved from the
    somewhat arbitrary defaults.
    
    Finally, when commit bb1d132935c2 ("drm/i915/vbt: split out defaults
    that are set when there is no VBT") stopped initializing the defaults
    whenever VBT is present, thus trusting the VBT more, we stopped
    initializing ports which were referenced by more than one child device.
    
    Apparently at least Asus UX305UA, UX305U, and UX306U laptops have VBT
    child device blocks which cause this behaviour. Arguably they were
    shipped with a broken VBT.
    
    Relax the rules for multiple references to the same port, and use the
    first child device info to reference a port. Retain the logic to debug
    log about this, though.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101745
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196233
    Fixes: bb1d132935c2 ("drm/i915/vbt: split out defaults that are set when there is no VBT")
    Tested-by: Oliver Weißbarth <mail@oweissbarth.de>
    Reported-by: Oliver Weißbarth <mail@oweissbarth.de>
    Reported-by: Didier G <didierg-divers@orange.fr>
    Reported-by: Giles Anderson <agander@gmail.com>
    Cc: Manasi Navare <manasi.d.navare@intel.com>
    Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
    Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Signed-off-by: Jani Nikula <jani.nikula@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20170811113907.6716-1-jani.nikula@intel.com
    Signed-off-by: Jani Nikula <jani.nikula@intel.com>
    (cherry picked from commit b5273d72750555a673040070bfb23c454a7cd3ef)
    Signed-off-by: Jani Nikula <jani.nikula@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d76df456a313a7f44fd4e4fcfc98377066413fe3
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Tue Aug 15 11:57:06 2017 +0200

    drm/atomic: If the atomic check fails, return its value first
    
    commit a0ffc51e20e90e0c1c2491de2b4b03f48b6caaba upstream.
    
    The last part of drm_atomic_check_only is testing whether we need to
    fail with -EINVAL when modeset is not allowed, but forgets to return
    the value when atomic_check() fails first.
    
    This results in -EDEADLK being replaced by -EINVAL, and the sanity
    check in drm_modeset_drop_locks kicks in:
    
    [  308.531734] ------------[ cut here ]------------
    [  308.531791] WARNING: CPU: 0 PID: 1886 at drivers/gpu/drm/drm_modeset_lock.c:217 drm_modeset_drop_locks+0x33/0xc0 [drm]
    [  308.531828] Modules linked in:
    [  308.532050] CPU: 0 PID: 1886 Comm: kms_atomic Tainted: G     U  W 4.13.0-rc5-patser+ #5225
    [  308.532082] Hardware name: NUC5i7RYB, BIOS RYBDWi35.86A.0246.2015.0309.1355 03/09/2015
    [  308.532124] task: ffff8800cd9dae00 task.stack: ffff8800ca3b8000
    [  308.532168] RIP: 0010:drm_modeset_drop_locks+0x33/0xc0 [drm]
    [  308.532189] RSP: 0018:ffff8800ca3bf980 EFLAGS: 00010282
    [  308.532211] RAX: dffffc0000000000 RBX: ffff8800ca3bfaf8 RCX: 0000000013a171e6
    [  308.532235] RDX: 1ffff10019477f69 RSI: ffffffffa8ba4fa0 RDI: ffff8800ca3bfb48
    [  308.532258] RBP: ffff8800ca3bf998 R08: 0000000000000000 R09: 0000000000000003
    [  308.532281] R10: 0000000079dbe066 R11: 00000000f760b34b R12: 0000000000000001
    [  308.532304] R13: dffffc0000000000 R14: 00000000ffffffea R15: ffff880096889680
    [  308.532328] FS:  00007ff00959cec0(0000) GS:ffff8800d4e00000(0000) knlGS:0000000000000000
    [  308.532359] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  308.532380] CR2: 0000000000000008 CR3: 00000000ca2e3000 CR4: 00000000003406f0
    [  308.532402] Call Trace:
    [  308.532440]  drm_mode_atomic_ioctl+0x19fa/0x1c00 [drm]
    [  308.532488]  ? drm_atomic_set_property+0x1220/0x1220 [drm]
    [  308.532565]  ? avc_has_extended_perms+0xc39/0xff0
    [  308.532593]  ? lock_downgrade+0x610/0x610
    [  308.532640]  ? drm_atomic_set_property+0x1220/0x1220 [drm]
    [  308.532680]  drm_ioctl_kernel+0x154/0x1a0 [drm]
    [  308.532755]  drm_ioctl+0x624/0x8f0 [drm]
    [  308.532858]  ? drm_atomic_set_property+0x1220/0x1220 [drm]
    [  308.532976]  ? drm_getunique+0x210/0x210 [drm]
    [  308.533061]  do_vfs_ioctl+0xd92/0xe40
    [  308.533121]  ? ioctl_preallocate+0x1b0/0x1b0
    [  308.533160]  ? selinux_capable+0x20/0x20
    [  308.533191]  ? do_fcntl+0x1b1/0xbf0
    [  308.533219]  ? kasan_slab_free+0xa2/0xb0
    [  308.533249]  ? f_getown+0x4b/0xa0
    [  308.533278]  ? putname+0xcf/0xe0
    [  308.533309]  ? security_file_ioctl+0x57/0x90
    [  308.533342]  SyS_ioctl+0x4e/0x80
    [  308.533374]  entry_SYSCALL_64_fastpath+0x18/0xad
    [  308.533405] RIP: 0033:0x7ff00779e4d7
    [  308.533431] RSP: 002b:00007fff66a043d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
    [  308.533481] RAX: ffffffffffffffda RBX: 000000e7c7ca5910 RCX: 00007ff00779e4d7
    [  308.533560] RDX: 00007fff66a04430 RSI: 00000000c03864bc RDI: 0000000000000003
    [  308.533608] RBP: 00007ff007a5fb00 R08: 000000e7c7ca4620 R09: 000000e7c7ca5e60
    [  308.533647] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000070
    [  308.533685] R13: 0000000000000000 R14: 0000000000000000 R15: 000000e7c7ca5930
    [  308.533770] Code: ff df 55 48 89 e5 41 55 41 54 53 48 89 fb 48 83 c7
    50 48 89 fa 48 c1 ea 03 80 3c 02 00 74 05 e8 94 d4 16 e7 48 83 7b 50 00
    74 02 <0f> ff 4c 8d 6b 58 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1
    [  308.534086] ---[ end trace 77f11e53b1df44ad ]---
    
    Solve this by adding the missing return.
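    
    The control flow can be sketched in userspace as follows. This is a
    simplified model under stated assumptions: check_only_sketch() and its
    parameters are hypothetical and only mimic the ordering of the two
    checks described above.
    
```c
#include <errno.h>
#include <stdbool.h>

/*
 * 'driver_check_result' models the value returned by atomic_check().
 * It is returned first, so an early -EDEADLK is never replaced by the
 * -EINVAL produced by the later modeset test.
 */
static int check_only_sketch(int driver_check_result, bool allow_modeset,
                             bool needs_modeset)
{
    int ret = driver_check_result;

    if (ret)
        return ret;

    if (!allow_modeset && needs_modeset)
        return -EINVAL;

    return 0;
}
```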
    
    This is also a bugfix because we could end up rejecting updates with
    -EINVAL because of an early -EDEADLK, while if atomic_check ran to
    completion it might have downgraded the modeset to a fastset.
    
    Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
    Testcase: kms_atomic
    Link: https://patchwork.freedesktop.org/patch/msgid/20170815095706.23624-1-maarten.lankhorst@linux.intel.com
    Fixes: d34f20d6e2f2 ("drm: Atomic modeset ioctl")
    Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 247122f138c0b84f7b2bc163166b83b22550e827
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Mon Aug 14 12:07:21 2017 +0200

    drm/atomic: Handle -EDEADLK with out-fences correctly
    
    commit 7f5d6dac548b983702dd7aac1d463bd88dff50a8 upstream.
    
    complete_crtc_signaling() frees fence_state, but num_fences and
    fence_state are not zeroed when retrying. This caused duplicate fds
    in the fence_state array, followed by a BUG_ON in fs/file.c because
    we reallocate freed memory and install over an existing fd, or
    potentially other fun.
    
    Zero fence_state and num_fences correctly in the retry loop, which
    allows kms_atomic_transition to pass.
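    
    A userspace sketch of the retry pattern, assuming a much simpler state
    than the DRM code: struct attempt_state, try_once() and run_with_retry()
    are hypothetical names. The point is only that per-attempt state is
    reset at the top of the retry loop.
    
```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical per-attempt bookkeeping. */
struct attempt_state {
    int *fds;          /* array built up during one attempt */
    unsigned int num;  /* number of valid entries in fds */
};

/* Record one entry, then fail with -EDEADLK while *deadlocks_left > 0. */
static int try_once(struct attempt_state *st, int *deadlocks_left)
{
    int *grown = realloc(st->fds, (st->num + 1) * sizeof(*grown));

    if (!grown)
        return -ENOMEM;
    st->fds = grown;
    st->fds[st->num] = 100 + (int)st->num; /* made-up fd number */
    st->num++;

    if (*deadlocks_left > 0) {
        (*deadlocks_left)--;
        return -EDEADLK;
    }
    return 0;
}

/* Returns the entry count of the attempt that succeeded, or -errno. */
static int run_with_retry(int deadlocks)
{
    struct attempt_state st;
    int ret;

retry:
    /* Reset on every pass so a retry never sees stale entries. */
    st.fds = NULL;
    st.num = 0;

    ret = try_once(&st, &deadlocks);
    if (ret == -EDEADLK) {
        free(st.fds); /* the reset above then drops the stale pointer */
        goto retry;
    }
    if (!ret)
        ret = (int)st.num;
    free(st.fds);
    return ret;
}
```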
    
    Fixes: beaf5af48034 ("drm/fence: add out-fences support")
    Cc: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
    Cc: Brian Starkey <brian.starkey@arm.com> (v10)
    Cc: Sean Paul <seanpaul@chromium.org>
    Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
    Cc: Jani Nikula <jani.nikula@linux.intel.com>
    Cc: David Airlie <airlied@linux.ie>
    Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
    Testcase: kms_atomic_transitions.plane-all-modeset-transition-fencing
    (with CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y)
    Link: https://patchwork.freedesktop.org/patch/msgid/20170814100721.13340-1-maarten.lankhorst@linux.intel.com
    Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> #intel-gfx on irc
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d4ae641cc285e51311124495cde00bf7df9707c4
Author: Jonathan Liu <net147@gmail.com>
Date:   Mon Jul 10 16:55:04 2017 +1000

    drm/sun4i: Implement drm_driver lastclose to restore fbdev console
    
    commit 2a596fc9d974bb040eda9ab70bf8756fcaaa6afe upstream.
    
    The drm_driver lastclose callback is called when the last userspace
    DRM client has closed. Call drm_fbdev_cma_restore_mode() to restore
    the fbdev console; otherwise the fbdev console will stop working.
    
    Fixes: 9026e0d122ac ("drm: Add Allwinner A10 Display Engine support")
    Tested-by: Olliver Schinagl <oliver@schinagl.nl>
    Reviewed-by: Chen-Yu Tsai <wens@csie.org>
    Signed-off-by: Jonathan Liu <net147@gmail.com>
    Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 08353913312acb9620055189d8deea1d4eee0905
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Aug 19 13:05:58 2017 +0100

    drm: Release driver tracking before making the object available again
    
    commit fe4600a548f2763dec91b3b27a1245c370ceee2a upstream.
    
    This is the same bug as we fixed in commit f6cd7daecff5 ("drm: Release
    driver references to handle before making it available again"), but now
    the exposure is via the PRIME lookup tables. If we remove the
    object/handle from the PRIME lut, then a new request for the same
    object/fd will generate a new handle, thus for a short window that
    object is known to userspace by two different handles. Fix this by
    releasing the driver tracking before PRIME.
    
    Fixes: 0ff926c7d4f0 ("drm/prime: add exported buffers to current fprivs
    imported buffer list (v2)")
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: David Airlie <airlied@linux.ie>
    Cc: Daniel Vetter <daniel.vetter@intel.com>
    Cc: Rob Clark <robdclark@gmail.com>
    Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Cc: Thierry Reding <treding@nvidia.com>
    Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20170819120558.6465-1-chris@chris-wilson.co.uk
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b96c156551250c20042074cb67c221ea64058a98
Author: Nikhil Mahale <nmahale@nvidia.com>
Date:   Wed Aug 9 09:23:01 2017 +0530

    drm: Fix framebuffer leak
    
    commit 491ab4700d1b64f5cf2f9055e01613a923df5fab upstream.
    
    Do not leak the framebuffer if the client-provided crtc id is found to
    be invalid.
    
    Signed-off-by: Nikhil Mahale <nmahale@nvidia.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Link: https://patchwork.freedesktop.org/patch/msgid/1502250781-5779-1-git-send-email-nmahale@nvidia.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 865d89f80907de6aa30e447f578061c9850b3905
Author: Dave Martin <Dave.Martin@arm.com>
Date:   Fri Aug 18 16:57:01 2017 +0100

    arm64: fpsimd: Prevent registers leaking across exec
    
    commit 096622104e14d8a1db4860bd557717067a0515d2 upstream.
    
    There are some tricky dependencies between the different stages of
    flushing the FPSIMD register state during exec, and these can race
    with context switch in ways that can cause the old task's regs to
    leak across.  In particular, a context switch during the memset() can
    cause some of the task's old FPSIMD registers to reappear.
    
    Disabling preemption for this small window would be no big deal for
    performance: preemption is already disabled for similar scenarios
    like updating the FPSIMD registers in sigreturn.
    
    So, instead of rearranging things in ways that might swap existing
    subtle bugs for new ones, this patch just disables preemption
    around the FPSIMD state flushing so that races of this type can't
    occur here.  This brings fpsimd_flush_thread() into line with other
    code paths.
    
    Fixes: 674c242c9323 ("arm64: flush FP/SIMD state correctly after execve()")
    Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    Signed-off-by: Dave Martin <Dave.Martin@arm.com>
    Signed-off-by: Will Deacon <will.deacon@arm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 1c229d7ad7b91136a3ae904ecbba305c95577b92
Author: Pavel Tatashin <pasha.tatashin@oracle.com>
Date:   Fri Aug 25 15:55:46 2017 -0700

    mm/memblock.c: reversed logic in memblock_discard()
    
    commit 91b540f98872a206ea1c49e4aa6ea8eed0886644 upstream.
    
    The recently introduced memblock_discard() has a reversed logic bug:
    memory is freed for the static array instead of the dynamically
    allocated one.
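    
    The intended condition can be sketched in userspace like this. It is an
    illustration only: static_regions, struct region_table_sketch and
    discard_sketch() are hypothetical and do not mirror the memblock data
    structures.
    
```c
#include <stdbool.h>
#include <stdlib.h>

#define INIT_REGIONS 4

/* Statically allocated initial array; it must never be freed. */
static int static_regions[INIT_REGIONS];

struct region_table_sketch {
    int *regions; /* static_regions or a heap-allocated replacement */
};

/* Free the array only if it was dynamically allocated. */
static bool discard_sketch(struct region_table_sketch *t)
{
    if (t->regions != static_regions) {
        free(t->regions);
        t->regions = NULL;
        return true;
    }
    return false;
}
```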
    
    Link: http://lkml.kernel.org/r/1503511441-95478-2-git-send-email-pasha.tatashin@oracle.com
    Fixes: 3010f876500f ("mm: discard memblock data later")
    Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
    Reported-by: Woody Suwalski <terraluna977@gmail.com>
    Tested-by: Woody Suwalski <terraluna977@gmail.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f5024bb32d4d50b77f4fbc1e7251cf0f21def88e
Author: Eric Biggers <ebiggers@google.com>
Date:   Fri Aug 25 15:55:43 2017 -0700

    fork: fix incorrect fput of ->exe_file causing use-after-free
    
    commit 2b7e8665b4ff51c034c55df3cff76518d1a9ee3a upstream.
    
    Commit 7c051267931a ("mm, fork: make dup_mmap wait for mmap_sem for
    write killable") made it possible to kill a forking task while it is
    waiting to acquire its ->mmap_sem for write, in dup_mmap().
    
    However, it was overlooked that this introduced a new error path before
    a reference is taken on the mm_struct's ->exe_file.  Since the
    ->exe_file of the new mm_struct was already set to the old ->exe_file by
    the memcpy() in dup_mm(), it was possible for the mmput() in the error
    path of dup_mm() to drop a reference to ->exe_file which was never
    taken.
    
    This caused the struct file to later be freed prematurely.
    
    Fix it by updating mm_init() to NULL out the ->exe_file, in the same
    place it clears other things like the list of mmaps.
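    
    The reference imbalance and the fix can be modelled in userspace as
    below. This is a sketch under simplifying assumptions: the *_sketch
    types and functions are hypothetical, and a plain integer stands in for
    the real file reference count.
    
```c
#include <errno.h>
#include <string.h>

struct file_sketch {
    int refcount;
};

struct mm_sketch {
    struct file_sketch *exe_file;
};

/* Clear the copied pointer so teardown cannot drop a reference never taken. */
static void mm_init_sketch(struct mm_sketch *mm)
{
    mm->exe_file = NULL;
}

/* Teardown drops the exe_file reference if one is held. */
static void mmput_sketch(struct mm_sketch *mm)
{
    if (mm->exe_file)
        mm->exe_file->refcount--;
    mm->exe_file = NULL;
}

static int dup_mm_sketch(const struct mm_sketch *oldmm, int fail_early)
{
    struct mm_sketch mm;

    memcpy(&mm, oldmm, sizeof(mm)); /* copies the pointer, takes no reference */
    mm_init_sketch(&mm);

    if (fail_early) {
        /* Error path before the reference is taken. */
        mmput_sketch(&mm);
        return -EINTR;
    }

    /* Normal path: take the reference, then set the pointer. */
    mm.exe_file = oldmm->exe_file;
    if (mm.exe_file)
        mm.exe_file->refcount++;

    mmput_sketch(&mm); /* sketch only: tear the copy down again */
    return 0;
}
```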
    
    This bug was found by syzkaller.  It can be reproduced using the
    following C program:
    
        #define _GNU_SOURCE
        #include <pthread.h>
        #include <stdlib.h>
        #include <sys/mman.h>
        #include <sys/syscall.h>
        #include <sys/wait.h>
        #include <unistd.h>
    
        static void *mmap_thread(void *_arg)
        {
            for (;;) {
                mmap(NULL, 0x1000000, PROT_READ,
                     MAP_POPULATE|MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
            }
        }
    
        static void *fork_thread(void *_arg)
        {
            usleep(rand() % 10000);
            fork();
        }
    
        int main(void)
        {
            fork();
            fork();
            fork();
            for (;;) {
                if (fork() == 0) {
                    pthread_t t;
    
                    pthread_create(&t, NULL, mmap_thread, NULL);
                    pthread_create(&t, NULL, fork_thread, NULL);
                    usleep(rand() % 10000);
                    syscall(__NR_exit_group, 0);
                }
                wait(NULL);
            }
        }
    
    No special kernel config options are needed.  It usually causes a NULL
    pointer dereference in __remove_shared_vm_struct() during exit, or in
    dup_mmap() (which is usually inlined into copy_process()) during fork.
    Both are due to a vm_area_struct's ->vm_file being used after it's
    already been freed.
    
    Google Bug Id: 64772007
    
    Link: http://lkml.kernel.org/r/20170823211408.31198-1-ebiggers3@gmail.com
    Fixes: 7c051267931a ("mm, fork: make dup_mmap wait for mmap_sem for write killable")
    Signed-off-by: Eric Biggers <ebiggers@google.com>
    Tested-by: Mark Rutland <mark.rutland@arm.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Konstantin Khlebnikov <koct9i@gmail.com>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 4823f4630bfd9bf3b4c9c47311b5c5f687729c52
Author: Eric Biggers <ebiggers@google.com>
Date:   Fri Aug 25 15:55:39 2017 -0700

    mm/madvise.c: fix freeing of locked page with MADV_FREE
    
    commit 263630e8d176d87308481ebdcd78ef9426739c6b upstream.
    
    If madvise(..., MADV_FREE) split a transparent hugepage, it called
    put_page() before unlock_page().
    
    This was wrong because put_page() can free the page, e.g. if a
    concurrent madvise(..., MADV_DONTNEED) has removed it from the memory
    mapping. put_page() then rightfully complained about freeing a locked
    page.
    
    Fix this by moving the unlock_page() before put_page().
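    
    A toy userspace model of why the order matters, assuming a page that is
    freed when its last reference is dropped. struct page_sketch and the
    *_sketch helpers are hypothetical; bad_state stands in for the "Bad page
    state" report.
    
```c
#include <stdbool.h>

struct page_sketch {
    int refcount;
    bool locked;
    bool freed;
    bool bad_state; /* set if the page is freed while still locked */
};

static void unlock_page_sketch(struct page_sketch *page)
{
    page->locked = false;
}

/* Dropping the last reference frees the page. */
static void put_page_sketch(struct page_sketch *page)
{
    if (--page->refcount == 0) {
        if (page->locked)
            page->bad_state = true;
        page->freed = true;
    }
}

/* Corrected order: unlock first, then drop the reference. */
static void release_after_split(struct page_sketch *page)
{
    unlock_page_sketch(page);
    put_page_sketch(page);
}
```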
    
    This bug was found by syzkaller, which encountered the following splat:
    
        BUG: Bad page state in process syzkaller412798  pfn:1bd800
        page:ffffea0006f60000 count:0 mapcount:0 mapping:          (null) index:0x20a00
        flags: 0x200000000040019(locked|uptodate|dirty|swapbacked)
        raw: 0200000000040019 0000000000000000 0000000000020a00 00000000ffffffff
        raw: ffffea0006f60020 ffffea0006f60020 0000000000000000 0000000000000000
        page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
        bad because of flags: 0x1(locked)
        Modules linked in:
        CPU: 1 PID: 3037 Comm: syzkaller412798 Not tainted 4.13.0-rc5+ #35
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
        Call Trace:
         __dump_stack lib/dump_stack.c:16 [inline]
         dump_stack+0x194/0x257 lib/dump_stack.c:52
         bad_page+0x230/0x2b0 mm/page_alloc.c:565
         free_pages_check_bad+0x1f0/0x2e0 mm/page_alloc.c:943
         free_pages_check mm/page_alloc.c:952 [inline]
         free_pages_prepare mm/page_alloc.c:1043 [inline]
         free_pcp_prepare mm/page_alloc.c:1068 [inline]
         free_hot_cold_page+0x8cf/0x12b0 mm/page_alloc.c:2584
         __put_single_page mm/swap.c:79 [inline]
         __put_page+0xfb/0x160 mm/swap.c:113
         put_page include/linux/mm.h:814 [inline]
         madvise_free_pte_range+0x137a/0x1ec0 mm/madvise.c:371
         walk_pmd_range mm/pagewalk.c:50 [inline]
         walk_pud_range mm/pagewalk.c:108 [inline]
         walk_p4d_range mm/pagewalk.c:134 [inline]
         walk_pgd_range mm/pagewalk.c:160 [inline]
         __walk_page_range+0xc3a/0x1450 mm/pagewalk.c:249
         walk_page_range+0x200/0x470 mm/pagewalk.c:326
         madvise_free_page_range.isra.9+0x17d/0x230 mm/madvise.c:444
         madvise_free_single_vma+0x353/0x580 mm/madvise.c:471
         madvise_dontneed_free mm/madvise.c:555 [inline]
         madvise_vma mm/madvise.c:664 [inline]
         SYSC_madvise mm/madvise.c:832 [inline]
         SyS_madvise+0x7d3/0x13c0 mm/madvise.c:760
         entry_SYSCALL_64_fastpath+0x1f/0xbe
    
    Here is a C reproducer:
    
        #define _GNU_SOURCE
        #include <pthread.h>
        #include <sys/mman.h>
        #include <unistd.h>
    
        #define MADV_FREE   8
        #define PAGE_SIZE   4096
    
        static void *mapping;
        static const size_t mapping_size = 0x1000000;
    
        static void *madvise_thrproc(void *arg)
        {
            madvise(mapping, mapping_size, (long)arg);
        }
    
        int main(void)
        {
            pthread_t t[2];
    
            for (;;) {
                mapping = mmap(NULL, mapping_size, PROT_WRITE,
                               MAP_POPULATE|MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
    
                munmap(mapping + mapping_size / 2, PAGE_SIZE);
    
                pthread_create(&t[0], 0, madvise_thrproc, (void*)MADV_DONTNEED);
                pthread_create(&t[1], 0, madvise_thrproc, (void*)MADV_FREE);
                pthread_join(t[0], NULL);
                pthread_join(t[1], NULL);
                munmap(mapping, mapping_size);
            }
        }
    
    Note: to see the splat, CONFIG_TRANSPARENT_HUGEPAGE=y and
    CONFIG_DEBUG_VM=y are needed.
    
    Google Bug Id: 64696096
    
    Link: http://lkml.kernel.org/r/20170823205235.132061-1-ebiggers3@gmail.com
    Fixes: 854e9ed09ded ("mm: support madvise(MADV_FREE)")
    Signed-off-by: Eric Biggers <ebiggers@google.com>
    Acked-by: David Rientjes <rientjes@google.com>
    Acked-by: Minchan Kim <minchan@kernel.org>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c237efed8b358f481c3dbe75638b62845fbc1cb7
Author: Ulf Hansson <ulf.hansson@linaro.org>
Date:   Wed Aug 9 15:28:22 2017 +0200

    i2c: designware: Fix system suspend
    
    commit a23318feeff662c8d25d21623daebdd2e55ec221 upstream.
    
    Commit 8503ff166504 ("i2c: designware: Avoid unnecessary resuming
    during system suspend") may suggest to the PM core to try out the
    so-called direct_complete path for system sleep. In this path, the PM
    core treats a runtime suspended device as if it's already in a proper
    low power state for system sleep, which makes it skip calling the system
    sleep callbacks for the device, except for the ->prepare() and the
    ->complete() callbacks.
    
    However, the PM core may unset the direct_complete flag for a parent
    device if its child devices have been system suspended before it. In
    this scenario, the PM core invokes the system sleep callbacks, no matter
    if the device is runtime suspended or not.
    
    Particularly in cases of an existing i2c slave device, the above path is
    triggered, which breaks the assumption that the i2c device is always
    runtime resumed whenever the dw_i2c_plat_suspend() is being called.
    
    More precisely, dw_i2c_plat_suspend() calls clk_core_disable() and
    clk_core_unprepare() for an already disabled/unprepared clock, leading to
    a splat in the log about clock calls being wrongly balanced and breaking
    system sleep.
    
    To still allow the direct_complete path in cases when it's possible, but
    also to keep the fix simple, let's runtime resume the i2c device in the
    ->suspend() callback, before continuing to put the device into low power
    state.
    
    Note, in cases when the i2c device is attached to the ACPI PM domain, this
    problem doesn't occur, because ACPI's ->suspend() callback, assigned to
    acpi_subsys_suspend(), already calls pm_runtime_resume() for the device.
    
    It should also be noted that this change does not fix commit 8503ff166504
    ("i2c: designware: Avoid unnecessary resuming during system suspend"),
    because for the non-ACPI case the system sleep support was already broken
    prior to that point.
    
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Tested-by: John Stultz <john.stultz@linaro.org>
    Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
    Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
    Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
    Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3a9495fd371909dde9829aefc8389b967280303d
Author: Ross Zwisler <ross.zwisler@linux.intel.com>
Date:   Fri Aug 25 15:55:36 2017 -0700

    dax: fix deadlock due to misaligned PMD faults
    
    commit fffa281b48a91ad6dac1a18c5907ece58fa3879b upstream.
    
    In DAX there are two separate places where the 2MiB range of a PMD is
    defined.
    
    The first is in the page tables, where a PMD mapping inserted for a
    given address spans from (vmf->address & PMD_MASK) to ((vmf->address &
    PMD_MASK) + PMD_SIZE - 1).  That is, from the 2MiB boundary below the
    address to the 2MiB boundary above the address.
    
    So, for example, a fault at address 3MiB (0x30 0000) falls within the
    PMD that ranges from 2MiB (0x20 0000) to 4MiB (0x40 0000).
    
    The second PMD range is in the mapping->page_tree, where a given file
    offset is covered by a radix tree entry that spans from one 2MiB aligned
    file offset to another 2MiB aligned file offset.
    
    So, for example, the file offset for 3MiB (pgoff 768) falls within the
    PMD range for the order 9 radix tree entry that ranges from 2MiB (pgoff
    512) to 4MiB (pgoff 1024).
    
    This system works so long as the addresses and file offsets for a given
    mapping both have the same offsets relative to the start of each PMD.
    
    Consider the case where the starting address for a given file isn't 2MiB
    aligned - say our faulting address is 3 MiB (0x30 0000), but that
    corresponds to the beginning of our file (pgoff 0).  Now all the PMDs in
    the mapping are misaligned so that the 2MiB range defined in the page
    tables never matches up with the 2MiB range defined in the radix tree.
    
    The current code notices this case for DAX faults to storage with the
    following test in dax_pmd_insert_mapping():
    
            if (pfn_t_to_pfn(pfn) & PG_PMD_COLOUR)
                    goto unlock_fallback;
    
    This test makes sure that the pfn we get from the driver is 2MiB
    aligned, and relies on the assumption that the 2MiB alignment of the pfn
    we get back from the driver matches the 2MiB alignment of the faulting
    address.
    
    However, faults to holes were not checked and we could hit the problem
    described above.
    
    This was reported in response to the NVML nvml/src/test/pmempool_sync
    TEST5:
    
            $ cd nvml/src/test/pmempool_sync
            $ make TEST5
    
    You can grab NVML here:
    
            https://github.com/pmem/nvml/
    
    The dmesg warning you see when you hit this error is:
    
      WARNING: CPU: 13 PID: 2900 at fs/dax.c:641 dax_insert_mapping_entry+0x2df/0x310
    
    Where we notice in dax_insert_mapping_entry() that the radix tree entry
    we are about to replace doesn't match the locked entry that we had
    previously inserted into the tree.  This happens because the initial
    insertion was done in grab_mapping_entry() using a pgoff calculated from
    the faulting address (vmf->address), and the replacement in
    dax_pmd_load_hole() => dax_insert_mapping_entry() is done using
    vmf->pgoff.
    
    In our failure case those two page offsets (one calculated from
    vmf->address, one using vmf->pgoff) point to different order 9 radix
    tree entries.
    
    This failure case can result in a deadlock because the radix tree unlock
    also happens on the pgoff calculated from vmf->address.  This means that
    the locked radix tree entry that we swapped in to the tree in
    dax_insert_mapping_entry() using vmf->pgoff is never unlocked, so all
    future faults to that 2MiB range will block forever.
    
    Fix this by validating that the faulting address's PMD offset matches
    the PMD offset from the start of the file.  This check is done at the
    very beginning of the fault and covers faults that would have mapped to
    storage as well as faults to holes.  I left the COLOUR check in
    dax_pmd_insert_mapping() in place in case we ever hit the insanity
    condition where the alignment of the pfn we get from the driver doesn't
    match the alignment of the userspace address.
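    
    As an illustration only, the check described above can be modelled in
    userspace as follows.  The constants assume 4KiB pages and 2MiB PMDs,
    and pmd_colour_matches() is a hypothetical helper name, not the
    function used in fs/dax.c:

```c
#include <stdint.h>

/* Assumed geometry: 4KiB pages, 2MiB PMDs (512 pages per PMD). */
#define PAGE_SHIFT      12
#define PMD_SHIFT       21
#define PG_PMD_COLOUR   ((1UL << (PMD_SHIFT - PAGE_SHIFT)) - 1)

/*
 * Return 1 if the faulting address and the file page offset sit at the
 * same position within their respective 2MiB ranges, 0 otherwise.
 */
int pmd_colour_matches(uint64_t address, uint64_t pgoff)
{
	uint64_t addr_colour = (address >> PAGE_SHIFT) & PG_PMD_COLOUR;
	uint64_t file_colour = pgoff & PG_PMD_COLOUR;

	return addr_colour == file_colour;
}
```

    With the example from this message (faulting address 3MiB, pgoff 0)
    the helper returns 0, which is the case where a PMD fault would be
    expected to take the fallback path.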
    
    Link: http://lkml.kernel.org/r/20170822222436.18926-1-ross.zwisler@linux.intel.com
    Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
    Reported-by: "Slusarz, Marcin" <marcin.slusarz@intel.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Dave Chinner <david@fromorbit.com>
    Cc: Matthew Wilcox <mawilcox@microsoft.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 735a252fc5b8558a01ceacdda44a74486e0f45e8
Author: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Date:   Fri Aug 25 15:55:33 2017 -0700

    mm, shmem: fix handling /sys/kernel/mm/transparent_hugepage/shmem_enabled
    
    commit 435c0b87d661da83771c30ed775f7c37eed193fb upstream.
    
    /sys/kernel/mm/transparent_hugepage/shmem_enabled controls whether
    huge pages are allocated for the private in-kernel shmem mount.
    
    Unfortunately, as Dan noticed, I've screwed it up and the only way to
    make the kernel allocate huge pages for the mount is to use "force"
    there.  All other values are effectively ignored.
    
    Link: http://lkml.kernel.org/r/20170822144254.66431-1-kirill.shutemov@linux.intel.com
    Fixes: 5a6e75f8110c ("shmem: prepare huge= mount option and sysfs knob")
    Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b2719637b16eac8a652b57551c9af47dd3c3c134
Author: Chen Yu <yu.c.chen@intel.com>
Date:   Fri Aug 25 15:55:30 2017 -0700

    PM/hibernate: touch NMI watchdog when creating snapshot
    
    commit 556b969a1cfe2686aae149137fa1dfcac0eefe54 upstream.
    
    Counting the pages for the hibernation snapshot can take a significant
    amount of time, especially on systems with large memory.  Since the
    counting is performed with irqs disabled, this might lead to an NMI
    watchdog lockup.  The following warning was seen on a system with
    1.5TB DRAM:
    
      Freezing user space processes ... (elapsed 0.002 seconds) done.
      OOM killer disabled.
      PM: Preallocating image memory...
      NMI watchdog: Watchdog detected hard LOCKUP on cpu 27
      CPU: 27 PID: 3128 Comm: systemd-sleep Not tainted 4.13.0-0.rc2.git0.1.fc27.x86_64 #1
      task: ffff9f01971ac000 task.stack: ffffb1a3f325c000
      RIP: 0010:memory_bm_find_bit+0xf4/0x100
      Call Trace:
       swsusp_set_page_free+0x2b/0x30
       mark_free_pages+0x147/0x1c0
       count_data_pages+0x41/0xa0
       hibernate_preallocate_memory+0x80/0x450
       hibernation_snapshot+0x58/0x410
       hibernate+0x17c/0x310
       state_store+0xdf/0xf0
       kobj_attr_store+0xf/0x20
       sysfs_kf_write+0x37/0x40
       kernfs_fop_write+0x11c/0x1a0
       __vfs_write+0x37/0x170
       vfs_write+0xb1/0x1a0
       SyS_write+0x55/0xc0
       entry_SYSCALL_64_fastpath+0x1a/0xa5
      ...
      done (allocated 6590003 pages)
      PM: Allocated 26360012 kbytes in 19.89 seconds (1325.28 MB/s)
    
    The operation took nearly 20 seconds (2.10GHz CPU), so the NMI lockup
    was triggered.  If the NMI watchdog timeout is set to 1 second, a
    safe interval would in theory be 6590003/20, or about 330k pages.
    However, some platforms may run at a lower frequency, so feed the
    watchdog every 100k pages.
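    
    A minimal userspace sketch of the feeding pattern, assuming the 128k
    interval mentioned in the notes of this changelog.
    count_watchdog_touches() is a hypothetical helper that only counts
    where the real code would call touch_nmi_watchdog():

```c
/* Interval taken from the note in this changelog: 128k pages. */
#define WD_PAGE_COUNT (128UL * 1024)

/*
 * Count how often a loop over nr_pages pages would feed the watchdog
 * when it does so on every WD_PAGE_COUNT-th page, starting with page 0.
 */
unsigned long count_watchdog_touches(unsigned long nr_pages)
{
	unsigned long page, touches = 0;

	for (page = 0; page < nr_pages; page++) {
		/* Power-of-two interval: a mask replaces the modulus. */
		if ((page & (WD_PAGE_COUNT - 1)) == 0)
			touches++;	/* stands in for touch_nmi_watchdog() */
	}
	return touches;
}
```

    Because the interval is a power of two, the test reduces to a mask,
    which is the reason given for preferring 128k over 100k.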
    
    [yu.c.chen@intel.com: simplification]
      Link: http://lkml.kernel.org/r/1503460079-29721-1-git-send-email-yu.c.chen@intel.com
    [yu.c.chen@intel.com: use interval of 128k instead of 100k to avoid modulus]
    Link: http://lkml.kernel.org/r/1503328098-5120-1-git-send-email-yu.c.chen@intel.com
    Signed-off-by: Chen Yu <yu.c.chen@intel.com>
    Reported-by: Jan Filipcewicz <jan.filipcewicz@intel.com>
    Suggested-by: Michal Hocko <mhocko@suse.com>
    Reviewed-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Len Brown <lenb@kernel.org>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 8b366972d7d4b553851c0f176a7224309bafeb6b
Author: Vineet Gupta <vgupta@synopsys.com>
Date:   Thu Aug 3 17:45:44 2017 +0530

    ARCv2: PAE40: set MSB even if !CONFIG_ARC_HAS_PAE40 but PAE exists in SoC
    
    commit b5ddb6d54729d814356937572d6c9b599f10c29f upstream.
    
    The PAE40 configuration in hardware extends some of the address
    registers for TLB/cache ops to 2 words.
    
    So far the kernel was NOT setting the higher word if the feature was
    not enabled in software, which is wrong.  It needs to be set to 0 in
    that case.
    
    Normally this would be done in the cache flush / TLB ops.  However,
    since these registers only exist conditionally, that would have to be
    conditional on a flag set at boot, which is expensive/ugly, especially
    for the more common case where PAE exists but is not in use.
    Optimize that by zeroing them once at boot; nobody will write to
    them afterwards.
    
    Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit fcedf2f285706bb478e40feef463ee306557d8cb
Author: Alexey Brodkin <Alexey.Brodkin@synopsys.com>
Date:   Tue Aug 1 12:58:47 2017 +0300

    ARCv2: PAE40: Explicitly set MSB counterpart of SLC region ops addresses
    
    commit 7d79cee2c6540ea64dd917a14e2fd63d4ac3d3c0 upstream.
    
    It is necessary to explicitly set both SLC_AUX_RGN_START1 and
    SLC_AUX_RGN_END1, which hold the MSB bits of the physical address of
    the region start and end respectively; otherwise the SLC region
    operation is executed in an unpredictable manner.
    Without this patch, SLC flushes on HSDK (IOC disabled) were taking
    seconds.
    
    Reported-by: Vladimir Kondratiev <vladimir.kondratiev@intel.com>
    Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
    Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    [vgupta: PAE40 regs only written if PAE40 exists]
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 763ad31728e81aeb6f0dc49d5ddf8503670e1c55
Author: Alexey Brodkin <abrodkin@synopsys.com>
Date:   Fri Jul 7 12:25:14 2017 +0300

    ARCv2: SLC: Make sure busy bit is set properly for region ops
    
    commit b37174d95b0251611a80ef60abf03752e9d66d67 upstream.
    
    c70c473396cb "ARCv2: SLC: Make sure busy bit is set properly on SLC flushing"
    fixes the problem for whole-SLC operations, where the problem was
    initially caught.  But given the nature of the issue, it is perfectly
    possible for the busy bit to be read incorrectly even when a region
    operation was started.
    
    So extend the initial fix to region operations as well.
    
    Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
    Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 8537b1e0ff7f502c6a790605eea21d36a82457c5
Author: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Date:   Sun Aug 20 15:55:02 2017 +0900

    ALSA: firewire-motu: destroy stream data surely at failure of card initialization
    
    commit dbd7396b4f24e0c3284fcc05f5def24f52c09884 upstream.
    
    When sound card registration fails after stream data has been
    initialized, this module leaves the data allocated for the streams
    without destroying it.  This commit fixes the bug.
    
    Fixes: 9b2bb4f2f4a2 ('ALSA: firewire-motu: add stream management functionality')
    Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 59d000610dc3dcd87dc3e67a9f1954518a0ad879
Author: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Date:   Sun Aug 20 15:54:26 2017 +0900

    ALSA: firewire: fix NULL pointer dereference when releasing uninitialized data of iso-resource
    
    commit 0c264af7be2013266c5b4c644f3f366399ee490a upstream.
    
    When 'iso_resource_free()' is called for uninitialized data, it
    dereferences a NULL pointer through the 'unit' member.  This occurs
    when audio and music units on the IEEE 1394 bus are unplugged after
    card registration has failed.
    
    This commit fixes the bug.  The bug has existed since kernel v4.5.
    
    Fixes: 324540c4e05c ('ALSA: fireface: postpone sound card registration') at v4.12
    Fixes: 8865a31e0fd8 ('ALSA: firewire-motu: postpone sound card registration') at v4.12
    Fixes: b610386c8afb ('ALSA: firewire-tascam: deleyed registration of sound card') at v4.7
    Fixes: 86c8dd7f4da3 ('ALSA: firewire-digi00x: delayed registration of sound card') at v4.7
    Fixes: 6c29230e2a5f ('ALSA: oxfw: delayed registration of sound card') at v4.7
    Fixes: 7d3c1d5901aa ('ALSA: fireworks: delayed registration of sound card') at v4.7
    Fixes: 04a2c73c97eb ('ALSA: bebob: delayed registration of sound card') at v4.7
    Fixes: b59fb1900b4f ('ALSA: dice: postpone card registration') at v4.5
    Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 2f45c61ba400eb137512a75654019cf4adeb4c9f
Author: Takashi Iwai <tiwai@suse.de>
Date:   Wed Aug 23 09:30:17 2017 +0200

    ALSA: hda - Add stereo mic quirk for Lenovo G50-70 (17aa:3978)
    
    commit bbba6f9d3da357bbabc6fda81e99ff5584500e76 upstream.
    
    Lenovo G50-70 (17aa:3978) with a Conexant codec chip requires a
    workaround for the inverted stereo dmic similar to the one used for
    other Lenovo models.
    
    Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1020657
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ba6b08b62f0cc989a9b9465bf5bdca2b68ba43d4
Author: Takashi Iwai <tiwai@suse.de>
Date:   Tue Aug 22 08:15:13 2017 +0200

    ALSA: core: Fix unexpected error at replacing user TLV
    
    commit 88c54cdf61f508ebcf8da2d819f5dfc03e954d1d upstream.
    
    When a user tries to replace a user-defined control TLV, the kernel
    checks whether its content has changed via memcmp().  The problem is
    that the kernel passes the return value from memcmp() through as is.
    memcmp() may return a negative value depending on the comparison
    result, and this is then interpreted as an error code.
    
    The patch covers that corner case and properly returns 1 for a
    changed TLV.
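    
    The normalization can be sketched as below.  This is a standalone
    illustration, and tlv_content_changed() is a hypothetical name rather
    than the function touched by the patch:

```c
#include <string.h>

/*
 * Hypothetical helper: report whether a TLV buffer changed.
 * Returns 1 if the contents differ and 0 if they are identical, so a
 * negative memcmp() result can never be mistaken for an error code.
 */
int tlv_content_changed(const void *old_tlv, const void *new_tlv, size_t size)
{
	return memcmp(old_tlv, new_tlv, size) != 0;
}
```

    The result is the same whichever buffer compares as "smaller", which
    is the property the raw memcmp() return value lacks.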
    
    Fixes: 8aa9b586e420 ("[ALSA] Control API - more robust TLV implementation")
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 1157dcda136aa1ad408f7dbc4aa385639ee783e2
Author: Joakim Tjernlund <joakim.tjernlund@infinera.com>
Date:   Tue Aug 22 08:33:53 2017 +0200

    ALSA: usb-audio: Add delay quirk for H650e/Jabra 550a USB headsets
    
    commit 07b3b5e9ed807a0d2077319b8e43a42e941db818 upstream.
    
    These headsets report a lot of "cannot set freq 44100 to ep 0x81"
    errors and need a small delay between sample rate settings, just like
    the Zoom R16/24.  Add both headsets to the Zoom R16/24 quirk for
    a 1 ms delay between control msgs.
    
    Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 2f76f62aef900c1d344c0c4d33d9652451b5c31a
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Thu Aug 24 11:59:31 2017 +0200

    KVM: x86: block guest protection keys unless the host has them enabled
    
    commit c469268cd523245cc58255f6696e0c295485cb0b upstream.
    
    If the host has protection keys disabled, we cannot read and write the
    guest PKRU---RDPKRU and WRPKRU fail with #GP(0) if CR4.PKE=0.  Block
    the PKU cpuid bit in that case.
    
    This ensures that guest_CR4.PKE=1 implies host_CR4.PKE=1.
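    
    A rough sketch of the filtering idea, using the architectural bit
    positions in CPUID.(EAX=7,ECX=0):ECX (PKU is bit 3, OSPKE is bit 4).
    filter_guest_cpuid7_ecx() is a hypothetical helper, not KVM's actual
    cpuid code:

```c
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):ECX feature bits. */
#define CPUID7_ECX_PKU   (1u << 3)
#define CPUID7_ECX_OSPKE (1u << 4)

/*
 * Hypothetical filter: given the ECX value that would be offered to a
 * guest and the host's own ECX value, hide PKU unless the host has
 * enabled protection keys (OSPKE mirrors CR4.PKE).
 */
uint32_t filter_guest_cpuid7_ecx(uint32_t guest_ecx, uint32_t host_ecx)
{
	if (!(host_ecx & CPUID7_ECX_OSPKE))
		guest_ecx &= ~CPUID7_ECX_PKU;
	return guest_ecx;
}
```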
    
    Fixes: 1be0e61c1f255faaeab04a390e00c8b9b9042870
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3c498d4bde8886f38ed3f4596caf23c0ecc12b87
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Wed Aug 23 23:16:29 2017 +0200

    KVM, pkeys: do not use PKRU value in vcpu->arch.guest_fpu.state
    
    commit 38cfd5e3df9c4f88e76b547eee2087ee5c042ae2 upstream.
    
    The host pkru is restored right after vcpu exit (commit 1be0e61), so
    KVM_GET_XSAVE will return the host PKRU value instead.  Fix this by
    using the guest PKRU explicitly in fill_xsave and load_xsave.  This
    part is based on a patch by Junkang Fu.
    
    The host PKRU data may also not match the value in vcpu->arch.guest_fpu.state,
    because it could have been changed by userspace since the last time
    it was saved, so skip loading it in kvm_load_guest_fpu.
    
    Reported-by: Junkang Fu <junkang.fjk@alibaba-inc.com>
    Cc: Yang Zhang <zy107165@alibaba-inc.com>
    Fixes: 1be0e61c1f255faaeab04a390e00c8b9b9042870
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d0e52c825f0022add8c6645de99f5d7766537193
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Wed Aug 23 23:14:38 2017 +0200

    KVM: x86: simplify handling of PKRU
    
    commit b9dd21e104bcd45e124acfe978a79df71259e59b upstream.
    
    Move it to struct kvm_arch_vcpu, replacing guest_pkru_valid with a
    simple comparison against the host value of the register.  The write of
    PKRU in addition can be skipped if the guest has not enabled the feature.
    Once we do this, we need not test OSPKE in the host anymore, because
    guest_CR4.PKE=1 implies host_CR4.PKE=1.
    
    The static PKU test is kept to elide the code on older CPUs.
    
    Suggested-by: Yang Zhang <zy107165@alibaba-inc.com>
    Fixes: 1be0e61c1f255faaeab04a390e00c8b9b9042870
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6dc06cd600d039884e63f728fbd52844dc541375
Author: Heiko Carstens <heiko.carstens@de.ibm.com>
Date:   Thu Aug 3 14:27:30 2017 +0200

    KVM: s390: sthyi: fix specification exception detection
    
    commit 857b8de96795646c5891cf44ae6fb19b9ff74bf9 upstream.
    
    sthyi should only generate a specification exception if the function
    code is zero and the response buffer is not on a 4k boundary.
    
    For unknown function codes, the current code would also test whether
    the response buffer, which is currently only defined for function
    code 0, is on a 4k boundary, and would incorrectly inject a
    specification exception instead of returning with condition code 3
    and return code 4 (unsupported function code).
    
    Fix this by moving the boundary check.
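    
    The intended order of the checks can be modelled like this.  The enum
    and sthyi_check() are hypothetical names for illustration, and the
    real handler performs further checks that are left out here:

```c
#include <stdint.h>

enum sthyi_result {
	STHYI_OK,			/* continue with emulation */
	STHYI_UNSUPPORTED_FC,		/* condition code 3, return code 4 */
	STHYI_SPEC_EXCEPTION,		/* specification exception */
};

/* Hypothetical model of the check order after the fix. */
enum sthyi_result sthyi_check(uint64_t function_code, uint64_t buffer_addr)
{
	/* Unknown function codes are rejected before looking at the buffer. */
	if (function_code != 0)
		return STHYI_UNSUPPORTED_FC;

	/* The response buffer for function code 0 must be 4k aligned. */
	if (buffer_addr & 0xfff)
		return STHYI_SPEC_EXCEPTION;

	return STHYI_OK;
}
```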
    
    Fixes: 95ca2cb57985 ("KVM: s390: Add sthyi emulation")
    Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
    Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: Cornelia Huck <cohuck@redhat.com>
    Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e516834ae81be22a833b36be55ed4a95606caec9
Author: Heiko Carstens <heiko.carstens@de.ibm.com>
Date:   Thu Aug 3 13:05:11 2017 +0200

    KVM: s390: sthyi: fix sthyi inline assembly
    
    commit 4a4eefcd0e49f9f339933324c1bde431186a0a7d upstream.
    
    The sthyi inline assembly is missing register r3 in the clobber
    list.  The sthyi instruction will always write a return code to
    register "R2+1", which in this case would be r3.  Because of that we
    may have register corruption and see host crashes or data corruption,
    depending on how gcc decided to allocate and use registers at
    compile time.
    
    Fixes: 95ca2cb57985 ("KVM: s390: Add sthyi emulation")
    Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
    Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: Cornelia Huck <cohuck@redhat.com>
    Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ddae9e6ec5d57397aa2d0225c62c58794f0e12e2
Author: Masaki Ota <masaki.ota@jp.alps.com>
Date:   Thu Aug 24 15:44:36 2017 -0700

    Input: ALPS - fix two-finger scroll breakage in right side on ALPS touchpad
    
    commit 4a646580f793d19717f7e034c8d473b509c27d49 upstream.
    
    Fix the issue that two-finger scroll does not work correctly with the
    V8 protocol.  The cause is that the V8 protocol X-coordinate decoding
    is wrong for SS4 PLUS devices.  Add an SS4 PLUS X decode definition.
    
    More notes:
    the problem was exposed by the commit e7348396c6d5 ("Input: ALPS
    - fix V8+ protocol handling (73 03 28)"), where a fix for the V8+
    protocol was applied.  Although the culprit must have been present
    beforehand, the two-finger scroll happened to work for some reason
    even with the wrongly reported values.  It got broken by the commit
    above just because it changed the x_max value, and this made libinput
    interpret the MT events correctly.  Since the X coord is falsely
    reported as doubled, the events on the right-half side go outside the
    boundary, thus they are no longer handled.  This resulted in a broken
    two-finger scroll.
    
    A one-finger event is decoded differently, and it didn't suffer from
    this problem.  The problem was only about MT events. --tiwai
    
    Fixes: e7348396c6d5 ("Input: ALPS - fix V8+ protocol handling (73 03 28)")
    Signed-off-by: Masaki Ota <masaki.ota@jp.alps.com>
    Tested-by: Takashi Iwai <tiwai@suse.de>
    Tested-by: Paul Donohue <linux-kernel@PaulSD.com>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 8dcee8e81a0e8ad502b7e09e4e3dd39ced09f4c8
Author: KT Liao <kt.liao@emc.com.tw>
Date:   Fri Aug 18 16:58:15 2017 -0700

    Input: elan_i2c - add ELAN0602 ACPI ID to support Lenovo Yoga310
    
    commit 1d2226e45040ed4aee95b633cbd64702bf7fc2a1 upstream.
    
    Add ELAN0602 to the list of known ACPI IDs to enable support for ELAN
    touchpads found in Lenovo Yoga310.
    
    Signed-off-by: KT Liao <kt.liao@emc.com.tw>
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 38c36f9d1fca5b5e6d2d1be13ab82d1ce714e4b1
Author: Aaron Ma <aaron.ma@canonical.com>
Date:   Fri Aug 18 12:17:21 2017 -0700

    Input: trackpoint - add new trackpoint firmware ID
    
    commit ec667683c532c93fb41e100e5d61a518971060e2 upstream.
    
    Synaptics added new TP firmware IDs: 0x2 and 0x3.  For now, both of
    the lower 2 bits indicate a TP.  Change the constant to bitwise
    values.
    
    This makes the trackpoint on the Lenovo Carbon X1 Gen5 get recognized
    instead of being identified as "PS/2 Generic Mouse".
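    
    A sketch of a bitwise ID test, under the assumption that an ID byte
    with either of its lower two bits set identifies a TrackPoint.
    TP_ID_MASK and is_trackpoint_id() are hypothetical names, not the
    driver's own:

```c
#include <stdint.h>

/* Assumed mask covering the lower two bits of the firmware ID byte. */
#define TP_ID_MASK 0x03

/* Hypothetical helper: 1 if the ID byte looks like a TrackPoint. */
int is_trackpoint_id(uint8_t firmware_id)
{
	return (firmware_id & TP_ID_MASK) != 0;
}
```

    Under that assumption the IDs 0x1, 0x2 and 0x3 are all accepted,
    whereas an equality test against a single constant accepts only one.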
    
    Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c9c682f3f029d3220a05c8d69ef9a022a93974fc
Author: Edward Cree <ecree@solarflare.com>
Date:   Fri Jul 21 14:37:34 2017 +0100

    bpf/verifier: fix min/max handling in BPF_SUB
    
    
    [ Upstream commit 9305706c2e808ae59f1eb201867f82f1ddf6d7a6 ]
    
    We have to subtract the src max from the dst min, and vice-versa, since
    (e.g.) the smallest result comes from the largest subtrahend.
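    
    As a worked illustration of the rule, here is a small interval
    subtraction.  struct range and range_sub() are hypothetical names,
    not the verifier's data structures, and overflow is ignored:

```c
#include <stdint.h>

/* Inclusive bounds on an unknown value. */
struct range {
	int64_t min;
	int64_t max;
};

/*
 * Bounds of (dst - src).  The smallest result pairs the smallest dst
 * with the largest src; the largest result pairs the largest dst with
 * the smallest src.  Overflow handling is deliberately left out.
 */
struct range range_sub(struct range dst, struct range src)
{
	struct range res;

	res.min = dst.min - src.max;
	res.max = dst.max - src.min;
	return res;
}
```

    For dst in [10, 20] and src in [1, 3] this gives [7, 19]; subtracting
    min from min and max from max would wrongly give [9, 17].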
    
    Fixes: 484611357c19 ("bpf: allow access into map value arrays")
    Signed-off-by: Edward Cree <ecree@solarflare.com>
    Acked-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit eb6cf01cd6b7630d51df66d8f5fa28ef250372bc
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Fri Jul 21 00:00:21 2017 +0200

    bpf: fix mixed signed/unsigned derived min/max value bounds
    
    
    [ Upstream commit 4cabc5b186b5427b9ee5a7495172542af105f02b ]
    
    Edward reported that there's an issue in min/max value bounds
    tracking when signed and unsigned compares both provide hints
    on limits when having unknown variables. E.g. a program such
    as the following should have been rejected:
    
       0: (7a) *(u64 *)(r10 -8) = 0
       1: (bf) r2 = r10
       2: (07) r2 += -8
       3: (18) r1 = 0xffff8a94cda93400
       5: (85) call bpf_map_lookup_elem#1
       6: (15) if r0 == 0x0 goto pc+7
      R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R10=fp
       7: (7a) *(u64 *)(r10 -16) = -8
       8: (79) r1 = *(u64 *)(r10 -16)
       9: (b7) r2 = -1
      10: (2d) if r1 > r2 goto pc+3
      R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=0
      R2=imm-1,max_value=18446744073709551615,min_align=1 R10=fp
      11: (65) if r1 s> 0x1 goto pc+2
      R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=0,max_value=1
      R2=imm-1,max_value=18446744073709551615,min_align=1 R10=fp
      12: (0f) r0 += r1
      13: (72) *(u8 *)(r0 +0) = 0
      R0=map_value_adj(ks=8,vs=8,id=0),min_value=0,max_value=1 R1=inv,min_value=0,max_value=1
      R2=imm-1,max_value=18446744073709551615,min_align=1 R10=fp
      14: (b7) r0 = 0
      15: (95) exit
    
    What happens is that in the first part ...
    
       8: (79) r1 = *(u64 *)(r10 -16)
       9: (b7) r2 = -1
      10: (2d) if r1 > r2 goto pc+3
    
    ... r1 carries an unsigned value, and is compared as unsigned
    against a register carrying an immediate. Verifier deduces in
    reg_set_min_max() that since the compare is unsigned and operation
    is greater than (>), that in the fall-through/false case, r1's
    minimum bound must be 0 and maximum bound must be r2. Latter is
    larger than the bound and thus max value is reset back to being
    'invalid' aka BPF_REGISTER_MAX_RANGE. Thus, r1 state is now
    'R1=inv,min_value=0'. The subsequent test ...
    
      11: (65) if r1 s> 0x1 goto pc+2
    
    ... is a signed compare of r1 with immediate value 1. Here,
    verifier deduces in reg_set_min_max() that since the compare
    is signed this time and operation is greater than (>), that
    in the fall-through/false case, we can deduce that r1's maximum
    bound must be 1, meaning with prior test, we result in r1 having
    the following state: R1=inv,min_value=0,max_value=1. Given that
    the actual value this holds is -8, the bounds are wrongly deduced.
    When this is being added to r0 which holds the map_value(_adj)
    type, then subsequent store access in above case will go through
    check_mem_access() which invokes check_map_access_adj(), that
    will then probe whether the map memory is in bounds based
    on the min_value and max_value as well as access size since
    the actual unknown value is min_value <= x <= max_value; commit
    fce366a9dd0d ("bpf, verifier: fix alu ops against map_value{,
    _adj} register types") provides some more explanation on the
    semantics.
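    
    The effect of the two compares in the first example can be reproduced
    with plain integers.  The helpers below are hypothetical and only
    model the comparisons, not the verifier itself:

```c
#include <stdint.h>

/*
 * Return 1 if a 64-bit value falls through both compares from the
 * first example: the unsigned "r1 > r2" with r2 = -1 and the signed
 * "r1 s> 1".
 */
int falls_through_both_compares(int64_t value)
{
	uint64_t r1 = (uint64_t)value;
	uint64_t r2 = (uint64_t)-1;	/* 18446744073709551615 */

	if (r1 > r2)			/* unsigned compare, never true */
		return 0;
	if (value > 1)			/* signed compare on the same bits */
		return 0;
	return 1;
}

/* Return 1 if the value lies inside the deduced [0, 1] range. */
int in_deduced_range(int64_t value)
{
	return value >= 0 && value <= 1;
}
```

    The value -8 falls through both compares yet lies outside [0, 1],
    which is the mismatch between deduced and actual bounds described
    above.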
    
    It's worth noting in this context that in the current code,
    min_value and max_value tracking are used for two things, i)
    dynamic map value access via check_map_access_adj() and since
    commit 06c1c049721a ("bpf: allow helpers access to variable memory")
    ii) also enforced at check_helper_mem_access() when passing a
    memory address (pointer to packet, map value, stack) and length
    pair to a helper and the length in this case is an unknown value
    defining an access range through min_value/max_value in that
    case. The min_value/max_value tracking is /not/ used in the
    direct packet access case to track ranges. However, the issue
    also affects case ii), for example, the following crafted program
    based on the same principle must be rejected as well:
    
       0: (b7) r2 = 0
       1: (bf) r3 = r10
       2: (07) r3 += -512
       3: (7a) *(u64 *)(r10 -16) = -8
       4: (79) r4 = *(u64 *)(r10 -16)
       5: (b7) r6 = -1
       6: (2d) if r4 > r6 goto pc+5
      R1=ctx R2=imm0,min_value=0,max_value=0,min_align=2147483648 R3=fp-512
      R4=inv,min_value=0 R6=imm-1,max_value=18446744073709551615,min_align=1 R10=fp
       7: (65) if r4 s> 0x1 goto pc+4
      R1=ctx R2=imm0,min_value=0,max_value=0,min_align=2147483648 R3=fp-512
      R4=inv,min_value=0,max_value=1 R6=imm-1,max_value=18446744073709551615,min_align=1
      R10=fp
       8: (07) r4 += 1
       9: (b7) r5 = 0
      10: (6a) *(u16 *)(r10 -512) = 0
      11: (85) call bpf_skb_load_bytes#26
      12: (b7) r0 = 0
      13: (95) exit
    
    Meaning, while we initialize the max_value stack slot that the
    verifier thinks we access in the [1,2] range, in reality we
    pass -7 as length which is interpreted as u32 in the helper.
    Thus, this issue is relevant also for the case of helper ranges.
    Resetting both bounds in check_reg_overflow() in case only one
    of them exceeds limits is also not enough as similar test can be
    created that uses values which are within range, thus also here
    learned min value in r1 is incorrect when mixed with later signed
    test to create a range:
    
       0: (7a) *(u64 *)(r10 -8) = 0
       1: (bf) r2 = r10
       2: (07) r2 += -8
       3: (18) r1 = 0xffff880ad081fa00
       5: (85) call bpf_map_lookup_elem#1
       6: (15) if r0 == 0x0 goto pc+7
      R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R10=fp
       7: (7a) *(u64 *)(r10 -16) = -8
       8: (79) r1 = *(u64 *)(r10 -16)
       9: (b7) r2 = 2
      10: (3d) if r2 >= r1 goto pc+3
      R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=3
      R2=imm2,min_value=2,max_value=2,min_align=2 R10=fp
      11: (65) if r1 s> 0x4 goto pc+2
      R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0
      R1=inv,min_value=3,max_value=4 R2=imm2,min_value=2,max_value=2,min_align=2 R10=fp
      12: (0f) r0 += r1
      13: (72) *(u8 *)(r0 +0) = 0
      R0=map_value_adj(ks=8,vs=8,id=0),min_value=3,max_value=4
      R1=inv,min_value=3,max_value=4 R2=imm2,min_value=2,max_value=2,min_align=2 R10=fp
      14: (b7) r0 = 0
      15: (95) exit
    
    This leaves us with two options for fixing this: i) to invalidate
    all prior learned information once we switch signed context, ii)
    to track min/max signed and unsigned boundaries separately as
    done in [0]. (Given latter introduces major changes throughout
    the whole verifier, it's rather net-next material, thus this
    patch follows option i), meaning we can derive bounds either
    from only signed tests or only unsigned tests.) There is still the
    case of adjust_reg_min_max_vals(), where we adjust bounds on ALU
    operations, meaning programs like the following where boundaries
    on the reg get mixed in context later on when bounds are merged
    on the dst reg must get rejected, too:
    
       0: (7a) *(u64 *)(r10 -8) = 0
       1: (bf) r2 = r10
       2: (07) r2 += -8
       3: (18) r1 = 0xffff89b2bf87ce00
       5: (85) call bpf_map_lookup_elem#1
       6: (15) if r0 == 0x0 goto pc+6
      R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R10=fp
       7: (7a) *(u64 *)(r10 -16) = -8
       8: (79) r1 = *(u64 *)(r10 -16)
       9: (b7) r2 = 2
      10: (3d) if r2 >= r1 goto pc+2
      R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=3
      R2=imm2,min_value=2,max_value=2,min_align=2 R10=fp
      11: (b7) r7 = 1
      12: (65) if r7 s> 0x0 goto pc+2
      R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=3
      R2=imm2,min_value=2,max_value=2,min_align=2 R7=imm1,max_value=0 R10=fp
      13: (b7) r0 = 0
      14: (95) exit
    
      from 12 to 15: R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0
      R1=inv,min_value=3 R2=imm2,min_value=2,max_value=2,min_align=2 R7=imm1,min_value=1 R10=fp
      15: (0f) r7 += r1
      16: (65) if r7 s> 0x4 goto pc+2
      R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R1=inv,min_value=3
      R2=imm2,min_value=2,max_value=2,min_align=2 R7=inv,min_value=4,max_value=4 R10=fp
      17: (0f) r0 += r7
      18: (72) *(u8 *)(r0 +0) = 0
      R0=map_value_adj(ks=8,vs=8,id=0),min_value=4,max_value=4 R1=inv,min_value=3
      R2=imm2,min_value=2,max_value=2,min_align=2 R7=inv,min_value=4,max_value=4 R10=fp
      19: (b7) r0 = 0
      20: (95) exit
    
    Meaning, in adjust_reg_min_max_vals() we must also reset range
    values on the dst when src/dst registers have mixed signed/
    unsigned derived min/max value bounds with one unbounded value
    as otherwise they can be added together deducing false boundaries.
    Once both boundaries are established from either ALU ops or
    compare operations w/o mixing signed/unsigned insns, then they
    can safely be added to other regs also having both boundaries
    established. Adding regs with one unbounded side to a map value
    where the bounded side has been learned w/o mixing ops is
    possible, but the resulting map value won't recover from that,
    meaning such op is considered invalid at the time of actual
    access. Invalid bounds are set on the dst reg in case i) src reg,
    or ii) in case dst reg already had them. The only way to recover
    would be to perform i) ALU ops but only 'add' is allowed on map
    value types or ii) comparisons, but these are disallowed on
    pointers in case they span a range. This is fine as only BPF_JEQ
    and BPF_JNE may be performed on PTR_TO_MAP_VALUE_OR_NULL registers
    which potentially turn them into PTR_TO_MAP_VALUE type depending
    on the branch, so only here min/max value cannot be invalidated
    for them.
    
    In terms of state pruning, value_from_signed is considered
    as well in states_equal() when dealing with adjusted map values.
    With regards to breaking existing programs, there is a small
    risk, but the use-cases where this could occur are rather narrow
    and mixing compares is probably unlikely.
    
    Joint work with Josef and Edward.
    
      [0] https://lists.iovisor.org/pipermail/iovisor-dev/2017-June/000822.html
    
    Fixes: 484611357c19 ("bpf: allow access into map value arrays")
    Reported-by: Edward Cree <ecree@solarflare.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: Edward Cree <ecree@solarflare.com>
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 659ee9687a11b70aab693b06f1214236920a23ee
Author: John Fastabend <john.fastabend@gmail.com>
Date:   Sun Jul 2 02:13:30 2017 +0200

    bpf, verifier: add additional patterns to evaluate_reg_imm_alu
    
    
    [ Upstream commit 43188702b3d98d2792969a3377a30957f05695e6 ]
    
    Currently the verifier does not track imm across alu operations when
    the source register is of unknown type. This adds additional pattern
    matching to catch this and track imm. We've seen LLVM generating this
    pattern while working on cilium.
    
    Signed-off-by: John Fastabend <john.fastabend@gmail.com>
    Acked-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d8a4ae09809a7fea7e2505fc4cc89695c744cf12
Author: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Date:   Sat Aug 19 15:37:07 2017 +0300

    net_sched: fix order of queue length updates in qdisc_replace()
    
    
    [ Upstream commit 68a66d149a8c78ec6720f268597302883e48e9fa ]
    
    It is important to call qdisc_tree_reduce_backlog() after changing the
    queue length. The parent qdisc should deactivate the class in
    ->qlen_notify(), called from qdisc_tree_reduce_backlog(), but this
    happens only if qdisc->q.qlen is zero.
    
    Missed class deactivations lead to crashes/warnings when picking packets
    from an empty qdisc, and to corrupted state when the class is
    reactivated later.
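
    As a rough userspace illustration of why the order matters, the
    sketch below models a notify hook that only deactivates the class
    when the child queue is already empty. All names (toy_qdisc,
    toy_class, toy_reduce_backlog, ...) are hypothetical stand-ins and
    not the kernel API.

    ```c
    #include <stdbool.h>

    /* Hypothetical stand-ins for a child qdisc and the parent's class. */
    struct toy_qdisc { unsigned int qlen; };
    struct toy_class { bool active; struct toy_qdisc *child; };

    /* Models ->qlen_notify(): deactivate only if the child is empty. */
    static void toy_qlen_notify(struct toy_class *cl)
    {
        if (cl->child->qlen == 0)
            cl->active = false;
    }

    /* Models qdisc_tree_reduce_backlog(): n packets left the child. */
    static void toy_reduce_backlog(struct toy_class *cl, unsigned int n)
    {
        if (n)
            toy_qlen_notify(cl);
    }

    /* Problematic order: notify first, reset the queue length afterwards. */
    static bool toy_reset_notify_first(struct toy_class *cl)
    {
        unsigned int n = cl->child->qlen;

        toy_reduce_backlog(cl, n);
        cl->child->qlen = 0;
        return cl->active;
    }

    /* Intended order: reset the queue length, then notify. */
    static bool toy_reset_qlen_first(struct toy_class *cl)
    {
        unsigned int n = cl->child->qlen;

        cl->child->qlen = 0;
        toy_reduce_backlog(cl, n);
        return cl->active;
    }
    ```

    With a child holding three packets, toy_reset_notify_first() leaves
    the class active although the child ends up empty, while
    toy_reset_qlen_first() deactivates it.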
    
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
    Fixes: 86a7996cc8a0 ("net_sched: introduce qdisc_replace() helper")
    Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 09e1d36d0289f9ee965c4550d124543a83090412
Author: Xin Long <lucien.xin@gmail.com>
Date:   Fri Aug 18 11:01:36 2017 +0800

    net: sched: fix NULL pointer dereference when action calls some targets
    
    
    [ Upstream commit 4f8a881acc9d1adaf1e552349a0b1df28933a04c ]
    
    Some targets' checkentry functions dereference par.entryinfo to
    inspect the entry. But when a sched action calls xt_check_target(),
    par.entryinfo is set to NULL, which causes a kernel panic with
    those targets.
    
    It can be reproduced with:
      # tc qd add dev eth1 ingress handle ffff:
      # tc filter add dev eth1 parent ffff: u32 match u32 0 0 action xt \
        -j ECN --ecn-tcp-remove
    
    It could also crash the kernel when using the CLUSTERIP or TPROXY
    targets.
    
    For now there is no proper value for par.entryinfo in ipt_init_target(),
    but it cannot be set to NULL. This patch avoids all these panics by
    setting it to an ipt_entry object with all members set to 0.
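
    The idea can be sketched in plain C as follows. The types and
    functions (toy_entry, toy_tgchk_param, toy_checkentry,
    toy_init_target) are simplified, hypothetical stand-ins rather than
    the real xtables structures; the point is only that a zeroed object
    lets a check hook reject the rule instead of faulting.

    ```c
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical, simplified stand-ins for the entry and check params. */
    struct toy_entry { unsigned char proto; unsigned int flags; };
    struct toy_tgchk_param { const void *entryinfo; };

    /* A checkentry-style hook that reads the entry without a NULL check. */
    static int toy_checkentry(const struct toy_tgchk_param *par)
    {
        const struct toy_entry *e = par->entryinfo;

        /* Would fault here if entryinfo were NULL. */
        return e->proto == 6 ? 0 : -1;
    }

    /* Caller side: hand over a zeroed entry instead of NULL. */
    static int toy_init_target(void)
    {
        struct toy_entry e;
        struct toy_tgchk_param par;

        memset(&e, 0, sizeof(e));
        par.entryinfo = &e;
        return toy_checkentry(&par);
    }
    ```

    In this sketch toy_init_target() returns an error because the zeroed
    entry does not match the protocol the hook expects, which is a clean
    failure rather than a crash.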
    
    Note that this issue has been there since the very beginning.
    
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f4e4a29699194e57e3545eabe0659f207a8f3ef3
Author: Colin Ian King <colin.king@canonical.com>
Date:   Thu Aug 17 23:14:58 2017 +0100

    irda: do not leak uninitialized list.dev to userspace
    
    
    [ Upstream commit b024d949a3c24255a7ef1a470420eb478949aa4c ]
    
    list.dev has not been initialized, so copy_to_user() copies
    data from the stack back to user space, which is a potential
    information leak. Fix this by ensuring all of list is initialized
    to zero.
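
    A minimal userspace sketch of the pattern, assuming hypothetical
    types toy_dev and toy_list and using memcpy() as a stand-in for
    copy_to_user(): zero the whole structure before filling in the
    fields that are actually set.

    ```c
    #include <string.h>

    /* Hypothetical layout: a length followed by one device record. */
    struct toy_dev { unsigned int saddr; unsigned int daddr; char info[22]; };
    struct toy_list { unsigned int len; struct toy_dev dev[1]; };

    /* Only len is assigned; zeroing first keeps dev[] free of stack data. */
    static void toy_fill_list(struct toy_list *out)
    {
        struct toy_list list;

        memset(&list, 0, sizeof(list));
        list.len = 0;
        memcpy(out, &list, sizeof(list));   /* stands in for copy_to_user() */
    }
    ```

    Without the memset(), the bytes of dev[] and any padding would carry
    whatever happened to be on the stack.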
    
    Detected by CoverityScan, CID#1357894 ("Uninitialized scalar variable")
    
    Signed-off-by: Colin Ian King <colin.king@canonical.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 754df4da61d1bcd16f8a2b950777f3a9f002cd0e
Author: Huy Nguyen <huyn@mellanox.com>
Date:   Thu Aug 17 18:29:52 2017 +0300

    net/mlx4_core: Enable 4K UAR if SRIOV module parameter is not enabled
    
    
    [ Upstream commit ca3d89a3ebe79367bd41b6b8ba37664478ae2dba ]
    
    The enable_4k_uar module parameter was added in the patch cited below
    to address the backward compatibility issue in SRIOV when the VM has
    the system's PAGE_SIZE uar implementation and the Hypervisor has the
    4k uar implementation.
    
    The above compatibility issue does not exist in the non SRIOV case.
    In this patch, we always enable 4k uar implementation if SRIOV
    is not enabled on mlx4's supported cards.
    
    Fixes: 76e39ccf9c36 ("net/mlx4_core: Fix backward compatibility on VFs")
    Signed-off-by: Huy Nguyen <huyn@mellanox.com>
    Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 2d093adfb109206702d469290a6ee6c83a222a71
Author: Neal Cardwell <ncardwell@google.com>
Date:   Wed Aug 16 17:53:36 2017 -0400

    tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP
    
    
    [ Upstream commit cdbeb633ca71a02b7b63bfeb94994bf4e1a0b894 ]
    
    In some situations tcp_send_loss_probe() can realize that it's unable
    to send a loss probe (TLP), and falls back to calling tcp_rearm_rto()
    to schedule an RTO timer. In such cases, sometimes tcp_rearm_rto()
    realizes that the RTO was eligible to fire immediately or at some
    point in the past (delta_us <= 0). Previously in such cases
    tcp_rearm_rto() was scheduling such "overdue" RTOs to happen at now +
    icsk_rto, which caused needless delays of hundreds of milliseconds
    (and non-linear behavior that made reproducible testing
    difficult). This commit changes the logic to schedule "overdue" RTOs
    ASAP, rather than at now + icsk_rto.
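
    The scheduling decision can be sketched as a small helper. The name
    toy_rto_delay_us and the choice of 1 microsecond as "ASAP" are
    assumptions made for illustration, not the kernel's actual code.

    ```c
    #include <stdint.h>

    /* Hypothetical helper: delay until the RTO should fire, in usec. */
    static int64_t toy_rto_delay_us(int64_t rto_us, int64_t elapsed_us)
    {
        int64_t delta_us = rto_us - elapsed_us;

        /* Overdue or due right now: fire as soon as possible rather
         * than waiting another full RTO. */
        return delta_us > 0 ? delta_us : 1;
    }
    ```

    For example, with a 200 ms RTO and 250 ms already elapsed, the helper
    returns the minimal delay instead of a further 200 ms.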
    
    Fixes: 6ba8a3b19e76 ("tcp: Tail loss probe (TLP)")
    Suggested-by: Yuchung Cheng <ycheng@google.com>
    Signed-off-by: Neal Cardwell <ncardwell@google.com>
    Signed-off-by: Yuchung Cheng <ycheng@google.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 7bbc60d9c916c0729f5049e9fa2fae89875cbfab
Author: Wei Wang <weiwan@google.com>
Date:   Fri Aug 18 17:14:49 2017 -0700

    ipv6: repair fib6 tree in failure case
    
    
    [ Upstream commit 348a4002729ccab8b888b38cbc099efa2f2a2036 ]
    
    In fib6_add(), it is possible that fib6_add_1() picks an intermediate
    node and sets the node's fn->leaf to NULL in order to add this new
    route. However, if fib6_add_rt2node() fails to add the new
    route for some reason, fn->leaf will be left as NULL and could
    potentially cause a crash when fn->leaf is accessed in fib6_locate().
    This patch makes sure fib6_repair_tree() is called to properly repair
    fn->leaf in the above failure case.
    
    Here is the syzkaller reported general protection fault in fib6_locate:
    kasan: CONFIG_KASAN_INLINE enabled
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] SMP KASAN
    Modules linked in:
    CPU: 0 PID: 40937 Comm: syz-executor3 Not tainted
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    task: ffff8801d7d64100 ti: ffff8801d01a0000 task.ti: ffff8801d01a0000
    RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] __ipv6_prefix_equal64_half include/net/ipv6.h:475 [inline]
    RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] ipv6_prefix_equal include/net/ipv6.h:492 [inline]
    RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] fib6_locate_1 net/ipv6/ip6_fib.c:1210 [inline]
    RIP: 0010:[<ffffffff82a3e0e1>]  [<ffffffff82a3e0e1>] fib6_locate+0x281/0x3c0 net/ipv6/ip6_fib.c:1233
    RSP: 0018:ffff8801d01a36a8  EFLAGS: 00010202
    RAX: 0000000000000020 RBX: ffff8801bc790e00 RCX: ffffc90002983000
    RDX: 0000000000001219 RSI: ffff8801d01a37a0 RDI: 0000000000000100
    RBP: ffff8801d01a36f0 R08: 00000000000000ff R09: 0000000000000000
    R10: 0000000000000003 R11: 0000000000000000 R12: 0000000000000001
    R13: dffffc0000000000 R14: ffff8801d01a37a0 R15: 0000000000000000
    FS:  00007f6afd68c700(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000004c6340 CR3: 00000000ba41f000 CR4: 00000000001426f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Stack:
     ffff8801d01a37a8 ffff8801d01a3780 ffffed003a0346f5 0000000c82a23ea0
     ffff8800b7bd7700 ffff8801d01a3780 ffff8800b6a1c940 ffffffff82a23ea0
     ffff8801d01a3920 ffff8801d01a3748 ffffffff82a223d6 ffff8801d7d64988
    Call Trace:
     [<ffffffff82a223d6>] ip6_route_del+0x106/0x570 net/ipv6/route.c:2109
     [<ffffffff82a23f9d>] inet6_rtm_delroute+0xfd/0x100 net/ipv6/route.c:3075
     [<ffffffff82621359>] rtnetlink_rcv_msg+0x549/0x7a0 net/core/rtnetlink.c:3450
     [<ffffffff8274c1d1>] netlink_rcv_skb+0x141/0x370 net/netlink/af_netlink.c:2281
     [<ffffffff82613ddf>] rtnetlink_rcv+0x2f/0x40 net/core/rtnetlink.c:3456
     [<ffffffff8274ad38>] netlink_unicast_kernel net/netlink/af_netlink.c:1206 [inline]
     [<ffffffff8274ad38>] netlink_unicast+0x518/0x750 net/netlink/af_netlink.c:1232
     [<ffffffff8274b83e>] netlink_sendmsg+0x8ce/0xc30 net/netlink/af_netlink.c:1778
     [<ffffffff82564aff>] sock_sendmsg_nosec net/socket.c:609 [inline]
     [<ffffffff82564aff>] sock_sendmsg+0xcf/0x110 net/socket.c:619
     [<ffffffff82564d62>] sock_write_iter+0x222/0x3a0 net/socket.c:834
     [<ffffffff8178523d>] new_sync_write+0x1dd/0x2b0 fs/read_write.c:478
     [<ffffffff817853f4>] __vfs_write+0xe4/0x110 fs/read_write.c:491
     [<ffffffff81786c38>] vfs_write+0x178/0x4b0 fs/read_write.c:538
     [<ffffffff817892a9>] SYSC_write fs/read_write.c:585 [inline]
     [<ffffffff817892a9>] SyS_write+0xd9/0x1b0 fs/read_write.c:577
     [<ffffffff82c71e32>] entry_SYSCALL_64_fastpath+0x12/0x17
    
    Note: there is no "Fixes" tag as this seems to be a bug introduced
    very early.
    
    Signed-off-by: Wei Wang <weiwan@google.com>
    Acked-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 368129fe14f6f27515c3b83ec6fb54bb9413de6c
Author: Wei Wang <weiwan@google.com>
Date:   Wed Aug 16 11:18:09 2017 -0700

    ipv6: reset fn->rr_ptr when replacing route
    
    
    [ Upstream commit 383143f31d7d3525a1dbff733d52fff917f82f15 ]
    
    syzkaller reported the following use-after-free issue in rt6_select():
    BUG: KASAN: use-after-free in rt6_select net/ipv6/route.c:755 [inline] at addr ffff8800bc6994e8
    BUG: KASAN: use-after-free in ip6_pol_route.isra.46+0x1429/0x1470 net/ipv6/route.c:1084 at addr ffff8800bc6994e8
    Read of size 4 by task syz-executor1/439628
    CPU: 0 PID: 439628 Comm: syz-executor1 Not tainted 4.3.5+ #8
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
     0000000000000000 ffff88018fe435b0 ffffffff81ca384d ffff8801d3588c00
     ffff8800bc699380 ffff8800bc699500 dffffc0000000000 ffff8801d40a47c0
     ffff88018fe435d8 ffffffff81735751 ffff88018fe43660 ffff8800bc699380
    Call Trace:
     [<ffffffff81ca384d>] __dump_stack lib/dump_stack.c:15 [inline]
     [<ffffffff81ca384d>] dump_stack+0xc1/0x124 lib/dump_stack.c:51
    sctp: [Deprecated]: syz-executor0 (pid 439615) Use of struct sctp_assoc_value in delayed_ack socket option.
    Use struct sctp_sack_info instead
     [<ffffffff81735751>] kasan_object_err+0x21/0x70 mm/kasan/report.c:158
     [<ffffffff817359c4>] print_address_description mm/kasan/report.c:196 [inline]
     [<ffffffff817359c4>] kasan_report_error+0x1b4/0x4a0 mm/kasan/report.c:285
     [<ffffffff81735d93>] kasan_report mm/kasan/report.c:305 [inline]
     [<ffffffff81735d93>] __asan_report_load4_noabort+0x43/0x50 mm/kasan/report.c:325
     [<ffffffff82a28e39>] rt6_select net/ipv6/route.c:755 [inline]
     [<ffffffff82a28e39>] ip6_pol_route.isra.46+0x1429/0x1470 net/ipv6/route.c:1084
     [<ffffffff82a28fb1>] ip6_pol_route_output+0x81/0xb0 net/ipv6/route.c:1203
     [<ffffffff82ab0a50>] fib6_rule_action+0x1f0/0x680 net/ipv6/fib6_rules.c:95
     [<ffffffff8265cbb6>] fib_rules_lookup+0x2a6/0x7a0 net/core/fib_rules.c:223
     [<ffffffff82ab1430>] fib6_rule_lookup+0xd0/0x250 net/ipv6/fib6_rules.c:41
     [<ffffffff82a22006>] ip6_route_output+0x1d6/0x2c0 net/ipv6/route.c:1224
     [<ffffffff829e83d2>] ip6_dst_lookup_tail+0x4d2/0x890 net/ipv6/ip6_output.c:943
     [<ffffffff829e889a>] ip6_dst_lookup_flow+0x9a/0x250 net/ipv6/ip6_output.c:1079
     [<ffffffff82a9f7d8>] ip6_datagram_dst_update+0x538/0xd40 net/ipv6/datagram.c:91
     [<ffffffff82aa0978>] __ip6_datagram_connect net/ipv6/datagram.c:251 [inline]
     [<ffffffff82aa0978>] ip6_datagram_connect+0x518/0xe50 net/ipv6/datagram.c:272
     [<ffffffff82aa1313>] ip6_datagram_connect_v6_only+0x63/0x90 net/ipv6/datagram.c:284
     [<ffffffff8292f790>] inet_dgram_connect+0x170/0x1f0 net/ipv4/af_inet.c:564
     [<ffffffff82565547>] SYSC_connect+0x1a7/0x2f0 net/socket.c:1582
     [<ffffffff8256a649>] SyS_connect+0x29/0x30 net/socket.c:1563
     [<ffffffff82c72032>] entry_SYSCALL_64_fastpath+0x12/0x17
    Object at ffff8800bc699380, in cache ip6_dst_cache size: 384
    
    The root cause of it is that in fib6_add_rt2node(), when it replaces an
    existing route with the new one, it does not update fn->rr_ptr.
    This commit resets fn->rr_ptr to NULL when it points to a route which is
    replaced in fib6_add_rt2node().
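
    A simplified sketch of the same idea on a singly linked list,
    assuming hypothetical types toy_route and toy_node: when an entry is
    replaced, any cached pointer to the old entry is cleared so it cannot
    be followed after the old entry is freed.

    ```c
    #include <stddef.h>

    struct toy_route { int id; struct toy_route *next; };
    struct toy_node { struct toy_route *leaf; struct toy_route *rr_ptr; };

    /* Replace 'old' with 'new_rt' in the node's list and drop a cached
     * round-robin pointer that still refers to 'old'. */
    static void toy_replace_route(struct toy_node *fn, struct toy_route *old,
                                  struct toy_route *new_rt)
    {
        struct toy_route **pp = &fn->leaf;

        while (*pp && *pp != old)
            pp = &(*pp)->next;
        if (!*pp)
            return;             /* 'old' is not on this node */

        new_rt->next = old->next;
        *pp = new_rt;
        if (fn->rr_ptr == old)
            fn->rr_ptr = NULL;
    }
    ```

    After the call the caller may release the old entry; a later lookup
    finds rr_ptr unset and has to pick a fresh starting point.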
    
    Fixes: 27596472473a ("ipv6: fix ECMP route replacement")
    Signed-off-by: Wei Wang <weiwan@google.com>
    Acked-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c549de482f89043eddda3ea33075180e8c479e49
Author: Eric Dumazet <edumazet@google.com>
Date:   Wed Aug 16 09:41:54 2017 -0700

    tipc: fix use-after-free
    
    
    [ Upstream commit 5bfd37b4de5c98e86b12bd13be5aa46c7484a125 ]
    
    syzkaller reported a use-after-free in tipc [1]
    
    When msg->rep skb is freed, set the pointer to NULL,
    so that caller does not free it again.
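
    The pattern in userspace terms, with hypothetical names (toy_msg,
    toy_dumpit, toy_handle) and malloc()/free() standing in for skb
    allocation and kfree_skb():

    ```c
    #include <stdlib.h>

    struct toy_msg { char *rep; };

    /* Callee: on failure, free the reply and clear the pointer. */
    static int toy_dumpit(struct toy_msg *msg, int fail)
    {
        msg->rep = malloc(32);
        if (!msg->rep)
            return -1;
        if (fail) {
            free(msg->rep);
            msg->rep = NULL;    /* caller must not see a stale pointer */
            return -1;
        }
        return 0;
    }

    /* Caller: cleanup is safe because free(NULL) is a no-op. */
    static int toy_handle(int fail)
    {
        struct toy_msg msg = { NULL };
        int err = toy_dumpit(&msg, fail);

        free(msg.rep);
        return err;
    }
    ```

    Without the assignment to NULL, the caller would still hold the
    address of memory that has already been released.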
    
    [1]
    
    ==================================================================
    BUG: KASAN: use-after-free in skb_push+0xd4/0xe0 net/core/skbuff.c:1466
    Read of size 8 at addr ffff8801c6e71e90 by task syz-executor5/4115
    
    CPU: 1 PID: 4115 Comm: syz-executor5 Not tainted 4.13.0-rc4+ #32
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
     __dump_stack lib/dump_stack.c:16 [inline]
     dump_stack+0x194/0x257 lib/dump_stack.c:52
     print_address_description+0x73/0x250 mm/kasan/report.c:252
     kasan_report_error mm/kasan/report.c:351 [inline]
     kasan_report+0x24e/0x340 mm/kasan/report.c:409
     __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430
     skb_push+0xd4/0xe0 net/core/skbuff.c:1466
     tipc_nl_compat_recv+0x833/0x18f0 net/tipc/netlink_compat.c:1209
     genl_family_rcv_msg+0x7b7/0xfb0 net/netlink/genetlink.c:598
     genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:623
     netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2397
     genl_rcv+0x28/0x40 net/netlink/genetlink.c:634
     netlink_unicast_kernel net/netlink/af_netlink.c:1265 [inline]
     netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1291
     netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1854
     sock_sendmsg_nosec net/socket.c:633 [inline]
     sock_sendmsg+0xca/0x110 net/socket.c:643
     sock_write_iter+0x31a/0x5d0 net/socket.c:898
     call_write_iter include/linux/fs.h:1743 [inline]
     new_sync_write fs/read_write.c:457 [inline]
     __vfs_write+0x684/0x970 fs/read_write.c:470
     vfs_write+0x189/0x510 fs/read_write.c:518
     SYSC_write fs/read_write.c:565 [inline]
     SyS_write+0xef/0x220 fs/read_write.c:557
     entry_SYSCALL_64_fastpath+0x1f/0xbe
    RIP: 0033:0x4512e9
    RSP: 002b:00007f3bc8184c08 EFLAGS: 00000216 ORIG_RAX: 0000000000000001
    RAX: ffffffffffffffda RBX: 0000000000718000 RCX: 00000000004512e9
    RDX: 0000000000000020 RSI: 0000000020fdb000 RDI: 0000000000000006
    RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000216 R12: 00000000004b5e76
    R13: 00007f3bc8184b48 R14: 00000000004b5e86 R15: 0000000000000000
    
    Allocated by task 4115:
     save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
     save_stack+0x43/0xd0 mm/kasan/kasan.c:447
     set_track mm/kasan/kasan.c:459 [inline]
     kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
     kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:489
     kmem_cache_alloc_node+0x13d/0x750 mm/slab.c:3651
     __alloc_skb+0xf1/0x740 net/core/skbuff.c:219
     alloc_skb include/linux/skbuff.h:903 [inline]
     tipc_tlv_alloc+0x26/0xb0 net/tipc/netlink_compat.c:148
     tipc_nl_compat_dumpit+0xf2/0x3c0 net/tipc/netlink_compat.c:248
     tipc_nl_compat_handle net/tipc/netlink_compat.c:1130 [inline]
     tipc_nl_compat_recv+0x756/0x18f0 net/tipc/netlink_compat.c:1199
     genl_family_rcv_msg+0x7b7/0xfb0 net/netlink/genetlink.c:598
     genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:623
     netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2397
     genl_rcv+0x28/0x40 net/netlink/genetlink.c:634
     netlink_unicast_kernel net/netlink/af_netlink.c:1265 [inline]
     netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1291
     netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1854
     sock_sendmsg_nosec net/socket.c:633 [inline]
     sock_sendmsg+0xca/0x110 net/socket.c:643
     sock_write_iter+0x31a/0x5d0 net/socket.c:898
     call_write_iter include/linux/fs.h:1743 [inline]
     new_sync_write fs/read_write.c:457 [inline]
     __vfs_write+0x684/0x970 fs/read_write.c:470
     vfs_write+0x189/0x510 fs/read_write.c:518
     SYSC_write fs/read_write.c:565 [inline]
     SyS_write+0xef/0x220 fs/read_write.c:557
     entry_SYSCALL_64_fastpath+0x1f/0xbe
    
    Freed by task 4115:
     save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
     save_stack+0x43/0xd0 mm/kasan/kasan.c:447
     set_track mm/kasan/kasan.c:459 [inline]
     kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
     __cache_free mm/slab.c:3503 [inline]
     kmem_cache_free+0x77/0x280 mm/slab.c:3763
     kfree_skbmem+0x1a1/0x1d0 net/core/skbuff.c:622
     __kfree_skb net/core/skbuff.c:682 [inline]
     kfree_skb+0x165/0x4c0 net/core/skbuff.c:699
     tipc_nl_compat_dumpit+0x36a/0x3c0 net/tipc/netlink_compat.c:260
     tipc_nl_compat_handle net/tipc/netlink_compat.c:1130 [inline]
     tipc_nl_compat_recv+0x756/0x18f0 net/tipc/netlink_compat.c:1199
     genl_family_rcv_msg+0x7b7/0xfb0 net/netlink/genetlink.c:598
     genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:623
     netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2397
     genl_rcv+0x28/0x40 net/netlink/genetlink.c:634
     netlink_unicast_kernel net/netlink/af_netlink.c:1265 [inline]
     netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1291
     netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1854
     sock_sendmsg_nosec net/socket.c:633 [inline]
     sock_sendmsg+0xca/0x110 net/socket.c:643
     sock_write_iter+0x31a/0x5d0 net/socket.c:898
     call_write_iter include/linux/fs.h:1743 [inline]
     new_sync_write fs/read_write.c:457 [inline]
     __vfs_write+0x684/0x970 fs/read_write.c:470
     vfs_write+0x189/0x510 fs/read_write.c:518
     SYSC_write fs/read_write.c:565 [inline]
     SyS_write+0xef/0x220 fs/read_write.c:557
     entry_SYSCALL_64_fastpath+0x1f/0xbe
    
    The buggy address belongs to the object at ffff8801c6e71dc0
     which belongs to the cache skbuff_head_cache of size 224
    The buggy address is located 208 bytes inside of
     224-byte region [ffff8801c6e71dc0, ffff8801c6e71ea0)
    The buggy address belongs to the page:
    page:ffffea00071b9c40 count:1 mapcount:0 mapping:ffff8801c6e71000 index:0x0
    flags: 0x200000000000100(slab)
    raw: 0200000000000100 ffff8801c6e71000 0000000000000000 000000010000000c
    raw: ffffea0007224a20 ffff8801d98caf48 ffff8801d9e79040 0000000000000000
    page dumped because: kasan: bad access detected
    
    Memory state around the buggy address:
     ffff8801c6e71d80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
     ffff8801c6e71e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    >ffff8801c6e71e80: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
                             ^
     ffff8801c6e71f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
     ffff8801c6e71f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ==================================================================
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: Dmitry Vyukov  <dvyukov@google.com>
    Cc: Jon Maloy <jon.maloy@ericsson.com>
    Cc: Ying Xue <ying.xue@windriver.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 62b3580fc3f05d7d2c15284e36f7ce1c2225bdcf
Author: Alexander Potapenko <glider@google.com>
Date:   Wed Aug 16 20:16:40 2017 +0200

    sctp: fully initialize the IPv6 address in sctp_v6_to_addr()
    
    
    [ Upstream commit 15339e441ec46fbc3bf3486bb1ae4845b0f1bb8d ]
    
    KMSAN reported use of uninitialized sctp_addr->v4.sin_addr.s_addr and
    sctp_addr->v6.sin6_scope_id in sctp_v6_cmp_addr() (see below).
    Make sure all fields of an IPv6 address are initialized, which
    guarantees that the IPv4 fields are also initialized.
    
    ==================================================================
     BUG: KMSAN: use of uninitialized memory in sctp_v6_cmp_addr+0x8d4/0x9f0
     net/sctp/ipv6.c:517
     CPU: 2 PID: 31056 Comm: syz-executor1 Not tainted 4.11.0-rc5+ #2944
     Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
     01/01/2011
     Call Trace:
      dump_stack+0x172/0x1c0 lib/dump_stack.c:42
      is_logbuf_locked mm/kmsan/kmsan.c:59 [inline]
      kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:938
      native_save_fl arch/x86/include/asm/irqflags.h:18 [inline]
      arch_local_save_flags arch/x86/include/asm/irqflags.h:72 [inline]
      arch_local_irq_save arch/x86/include/asm/irqflags.h:113 [inline]
      __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:467
      sctp_v6_cmp_addr+0x8d4/0x9f0 net/sctp/ipv6.c:517
      sctp_v6_get_dst+0x8c7/0x1630 net/sctp/ipv6.c:290
      sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
      sctp_assoc_add_peer+0x66d/0x16f0 net/sctp/associola.c:651
      sctp_sendmsg+0x35a5/0x4f90 net/sctp/socket.c:1871
      inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
      sock_sendmsg_nosec net/socket.c:633 [inline]
      sock_sendmsg net/socket.c:643 [inline]
      SYSC_sendto+0x608/0x710 net/socket.c:1696
      SyS_sendto+0x8a/0xb0 net/socket.c:1664
      entry_SYSCALL_64_fastpath+0x13/0x94
     RIP: 0033:0x44b479
     RSP: 002b:00007f6213f21c08 EFLAGS: 00000286 ORIG_RAX: 000000000000002c
     RAX: ffffffffffffffda RBX: 0000000020000000 RCX: 000000000044b479
     RDX: 0000000000000041 RSI: 0000000020edd000 RDI: 0000000000000006
     RBP: 00000000007080a8 R08: 0000000020b85fe4 R09: 000000000000001c
     R10: 0000000000040005 R11: 0000000000000286 R12: 00000000ffffffff
     R13: 0000000000003760 R14: 00000000006e5820 R15: 0000000000ff8000
     origin description: ----dst_saddr@sctp_v6_get_dst
     local variable created at:
      sk_fullsock include/net/sock.h:2321 [inline]
      inet6_sk include/linux/ipv6.h:309 [inline]
      sctp_v6_get_dst+0x91/0x1630 net/sctp/ipv6.c:241
      sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
    ==================================================================
    
    Signed-off-by: Alexander Potapenko <glider@google.com>
    Reviewed-by: Xin Long <lucien.xin@gmail.com>
    Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit dda844773c47c5695759faf03215ea01d6018717
Author: Eric Dumazet <edumazet@google.com>
Date:   Fri Aug 18 13:39:56 2017 -0700

    tun: handle register_netdevice() failures properly
    
    
    [ Upstream commit ff244c6b29b176f3f448bc75e55df297225e1b3a ]
    
    syzkaller reported a double free [1], caused by the fact
    that the tun driver was not updated properly when priv_destructor
    was added.
    
    When/if register_netdevice() fails, priv_destructor() must have been
    called already.
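
    A toy model of the ownership rule, with hypothetical names
    (toy_netdev, toy_destructor, toy_register, toy_set_iff): if the
    registration call already runs the destructor on failure, the
    caller's error path must not release the same state again.

    ```c
    #include <stdlib.h>

    struct toy_netdev { void *security; int free_calls; };

    /* Releases private state; counts calls so a double free is visible. */
    static void toy_destructor(struct toy_netdev *dev)
    {
        free(dev->security);
        dev->security = NULL;
        dev->free_calls++;
    }

    /* Models a register call that runs the destructor itself on failure. */
    static int toy_register(struct toy_netdev *dev, int fail)
    {
        if (fail) {
            toy_destructor(dev);
            return -1;
        }
        return 0;
    }

    static int toy_set_iff(struct toy_netdev *dev, int fail)
    {
        dev->security = malloc(32);
        dev->free_calls = 0;
        if (toy_register(dev, fail) < 0)
            return -1;  /* state already released; do not free again */
        return 0;
    }
    ```

    On the failure path the destructor runs exactly once; on success the
    state stays owned by the device until it is torn down.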
    
    [1]
    BUG: KASAN: double-free or invalid-free in selinux_tun_dev_free_security+0x15/0x20 security/selinux/hooks.c:5023
    
    CPU: 0 PID: 2919 Comm: syzkaller227220 Not tainted 4.13.0-rc4+ #23
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
     __dump_stack lib/dump_stack.c:16 [inline]
     dump_stack+0x194/0x257 lib/dump_stack.c:52
     print_address_description+0x7f/0x260 mm/kasan/report.c:252
     kasan_report_double_free+0x55/0x80 mm/kasan/report.c:333
     kasan_slab_free+0xa0/0xc0 mm/kasan/kasan.c:514
     __cache_free mm/slab.c:3503 [inline]
     kfree+0xd3/0x260 mm/slab.c:3820
     selinux_tun_dev_free_security+0x15/0x20 security/selinux/hooks.c:5023
     security_tun_dev_free_security+0x48/0x80 security/security.c:1512
     tun_set_iff drivers/net/tun.c:1884 [inline]
     __tun_chr_ioctl+0x2ce6/0x3d50 drivers/net/tun.c:2064
     tun_chr_ioctl+0x2a/0x40 drivers/net/tun.c:2309
     vfs_ioctl fs/ioctl.c:45 [inline]
     do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685
     SYSC_ioctl fs/ioctl.c:700 [inline]
     SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
     entry_SYSCALL_64_fastpath+0x1f/0xbe
    RIP: 0033:0x443ff9
    RSP: 002b:00007ffc34271f68 EFLAGS: 00000217 ORIG_RAX: 0000000000000010
    RAX: ffffffffffffffda RBX: 00000000004002e0 RCX: 0000000000443ff9
    RDX: 0000000020533000 RSI: 00000000400454ca RDI: 0000000000000003
    RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000217 R12: 0000000000401ce0
    R13: 0000000000401d70 R14: 0000000000000000 R15: 0000000000000000
    
    Allocated by task 2919:
     save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
     save_stack+0x43/0xd0 mm/kasan/kasan.c:447
     set_track mm/kasan/kasan.c:459 [inline]
     kasan_kmalloc+0xaa/0xd0 mm/kasan/kasan.c:551
     kmem_cache_alloc_trace+0x101/0x6f0 mm/slab.c:3627
     kmalloc include/linux/slab.h:493 [inline]
     kzalloc include/linux/slab.h:666 [inline]
     selinux_tun_dev_alloc_security+0x49/0x170 security/selinux/hooks.c:5012
     security_tun_dev_alloc_security+0x6d/0xa0 security/security.c:1506
     tun_set_iff drivers/net/tun.c:1839 [inline]
     __tun_chr_ioctl+0x1730/0x3d50 drivers/net/tun.c:2064
     tun_chr_ioctl+0x2a/0x40 drivers/net/tun.c:2309
     vfs_ioctl fs/ioctl.c:45 [inline]
     do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685
     SYSC_ioctl fs/ioctl.c:700 [inline]
     SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
     entry_SYSCALL_64_fastpath+0x1f/0xbe
    
    Freed by task 2919:
     save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
     save_stack+0x43/0xd0 mm/kasan/kasan.c:447
     set_track mm/kasan/kasan.c:459 [inline]
     kasan_slab_free+0x6e/0xc0 mm/kasan/kasan.c:524
     __cache_free mm/slab.c:3503 [inline]
     kfree+0xd3/0x260 mm/slab.c:3820
     selinux_tun_dev_free_security+0x15/0x20 security/selinux/hooks.c:5023
     security_tun_dev_free_security+0x48/0x80 security/security.c:1512
     tun_free_netdev+0x13b/0x1b0 drivers/net/tun.c:1563
     register_netdevice+0x8d0/0xee0 net/core/dev.c:7605
     tun_set_iff drivers/net/tun.c:1859 [inline]
     __tun_chr_ioctl+0x1caf/0x3d50 drivers/net/tun.c:2064
     tun_chr_ioctl+0x2a/0x40 drivers/net/tun.c:2309
     vfs_ioctl fs/ioctl.c:45 [inline]
     do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685
     SYSC_ioctl fs/ioctl.c:700 [inline]
     SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
     entry_SYSCALL_64_fastpath+0x1f/0xbe
    
    The buggy address belongs to the object at ffff8801d2843b40
     which belongs to the cache kmalloc-32 of size 32
    The buggy address is located 0 bytes inside of
     32-byte region [ffff8801d2843b40, ffff8801d2843b60)
    The buggy address belongs to the page:
    page:ffffea000660cea8 count:1 mapcount:0 mapping:ffff8801d2843000 index:0xffff8801d2843fc1
    flags: 0x200000000000100(slab)
    raw: 0200000000000100 ffff8801d2843000 ffff8801d2843fc1 000000010000003f
    raw: ffffea0006626a40 ffffea00066141a0 ffff8801dbc00100
    page dumped because: kasan: bad access detected
    
    Memory state around the buggy address:
     ffff8801d2843a00: fb fb fb fb fc fc fc fc fb fb fb fb fc fc fc fc
     ffff8801d2843a80: 00 00 00 fc fc fc fc fc fb fb fb fb fc fc fc fc
    >ffff8801d2843b00: 00 00 00 00 fc fc fc fc fb fb fb fb fc fc fc fc
                                               ^
     ffff8801d2843b80: fb fb fb fb fc fc fc fc fb fb fb fb fc fc fc fc
     ffff8801d2843c00: fb fb fb fb fc fc fc fc fb fb fb fb fc fc fc fc
    
    ==================================================================
    
    Fixes: cf124db566e6 ("net: Fix inconsistent teardown and release of private netdev state.")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3c3181e17b3596bb2c24a23b8e0293f813f71397
Author: Colin Ian King <colin.king@canonical.com>
Date:   Fri Aug 18 12:11:50 2017 +0100

    nfp: fix infinite loop on umapping cleanup
    
    
    [ Upstream commit eac2c68d663effb077210218788952b5a0c1f60e ]
    
    The while loop that performs the dma page unmapping never decrements
    index counter f and hence loops forever. Fix this with a pre-decrement
    on f.
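    
    A minimal userspace sketch of the unwind pattern follows. It is not the
    nfp driver code: map_frag(), unmap_frag(), map_all_frags() and
    count_mapped() are hypothetical stand-ins for the DMA mapping calls. It
    only illustrates why the index has to be decremented in the cleanup loop.
    
    ```c
    #include <stdbool.h>
    
    #define MAX_FRAGS 8
    
    /* Hypothetical bookkeeping standing in for real DMA mappings. */
    static bool mapped[MAX_FRAGS];
    
    /* Pretend to map fragment i; fails when i == fail_at. */
    static bool map_frag(int i, int fail_at)
    {
        if (i == fail_at)
            return false;
        mapped[i] = true;
        return true;
    }
    
    static void unmap_frag(int i)
    {
        mapped[i] = false;
    }
    
    /* Number of fragments that are still mapped. */
    int count_mapped(void)
    {
        int i, n = 0;
    
        for (i = 0; i < MAX_FRAGS; i++)
            n += mapped[i] ? 1 : 0;
        return n;
    }
    
    /*
     * Map nr_frags fragments (nr_frags <= MAX_FRAGS). On failure, unwind
     * the ones already mapped. Returns 0 on success, -1 on failure.
     */
    int map_all_frags(int nr_frags, int fail_at)
    {
        int f;
    
        for (f = 0; f < nr_frags; f++) {
            if (!map_frag(f, fail_at))
                goto err_unmap;
        }
        return 0;
    
    err_unmap:
        /* Pre-decrement: without it f never changes and the loop spins. */
        while (--f >= 0)
            unmap_frag(f);
        return -1;
    }
    ```
    
    With a forced failure at index 2, map_all_frags(4, 2) unmaps fragments
    1 and 0 and returns -1, leaving nothing mapped.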
    
    Detected by CoverityScan, CID#1357309 ("Infinite loop")
    
    Fixes: 4c3523623dc0 ("net: add driver for Netronome NFP4000/NFP6000 NIC VFs")
    Signed-off-by: Colin Ian King <colin.king@canonical.com>
    Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 9c579acf65228945a02e674f8c5766571a63b51f
Author: Eric Dumazet <edumazet@google.com>
Date:   Wed Aug 16 11:09:12 2017 -0700

    ipv4: better IP_MAX_MTU enforcement
    
    
    [ Upstream commit c780a049f9bf442314335372c9abc4548bfe3e44 ]
    
    While working on yet another syzkaller report, I found
    that our IP_MAX_MTU enforcements were not properly done.
    
    gcc seems to reload dev->mtu for min(dev->mtu, IP_MAX_MTU), and
    final result can be bigger than IP_MAX_MTU :/
    
    This is a problem because device mtu can be changed on other cpus or
    threads.
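    
    The idea can be sketched in userspace as below. This is an illustration
    under assumptions, not the kernel patch: struct fake_dev, READ_ONCE_UINT()
    and clamped_mtu() are hypothetical names, the volatile cast only
    approximates what the kernel's READ_ONCE() does, and IP_MAX_MTU is assumed
    to be 0xFFFFU.
    
    ```c
    /* Assumed value for this sketch. */
    #define IP_MAX_MTU 0xFFFFU
    
    /* Hypothetical stand-in for a net device with a shared mtu field. */
    struct fake_dev {
        unsigned int mtu;
    };
    
    /* Force a single load of x; rough analogue of READ_ONCE(). */
    #define READ_ONCE_UINT(x) (*(const volatile unsigned int *)&(x))
    
    /*
     * Load dev->mtu exactly once into a local, then clamp the local.
     * The compiler cannot reload dev->mtu between the comparison and
     * the return, so the result never exceeds IP_MAX_MTU.
     */
    unsigned int clamped_mtu(const struct fake_dev *dev)
    {
        unsigned int mtu = READ_ONCE_UINT(dev->mtu);
    
        return mtu < IP_MAX_MTU ? mtu : IP_MAX_MTU;
    }
    ```
    
    The point of the local copy is that both the comparison and the returned
    value come from the same load, even if another thread changes the field
    in between.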
    
    While this patch does not fix the issue I am working on, it is
    probably worth addressing it.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 12ee6d75d6a1e65e902aca105679d46291c16e40
Author: Eric Dumazet <edumazet@google.com>
Date:   Wed Aug 16 10:36:47 2017 -0700

    ptr_ring: use kmalloc_array()
    
    
    [ Upstream commit 81fbfe8adaf38d4f5a98c19bebfd41c5d6acaee8 ]
    
    As found by syzkaller, malicious users can set whatever tx_queue_len
    on a tun device and eventually crash the kernel.
    
    Let's remove the ALIGN(XXX, SMP_CACHE_BYTES) thing since a small
    ring buffer is not fast anyway.
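    
    As a userspace analogue, the overflow check that an array allocator such
    as kmalloc_array() performs can be sketched as follows.
    alloc_array_checked() is a hypothetical name and malloc() stands in for
    the kernel allocator; this is not the ptr_ring code itself.
    
    ```c
    #include <stdint.h>
    #include <stdlib.h>
    
    /*
     * Allocate room for n elements of the given size, refusing requests
     * whose total byte count would overflow size_t instead of silently
     * wrapping to a small allocation.
     */
    void *alloc_array_checked(size_t n, size_t size)
    {
        if (size != 0 && n > SIZE_MAX / size)
            return NULL;
        return malloc(n * size);
    }
    ```
    
    A caller passing an attacker-controlled element count then gets NULL
    back for absurd sizes and can fail the request cleanly.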
    
    Fixes: 2e0ab8ca83c1 ("ptr_ring: array based FIFO for pointers")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Cc: Michael S. Tsirkin <mst@redhat.com>
    Cc: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit cb445bfc10416e5d731d5036661fc6d7c0863545
Author: Liping Zhang <zlpnobody@gmail.com>
Date:   Wed Aug 16 13:30:07 2017 +0800

    openvswitch: fix skb_panic due to the incorrect actions attrlen
    
    
    [ Upstream commit 494bea39f3201776cdfddc232705f54a0bd210c4 ]
    
    For sw_flow_actions, actions_len only represents the size of the
    kernel-side actions. When we dump the actions to userspace they are
    converted, so their true size may become bigger than actions_len.
    
    But unfortunately, for OVS_PACKET_ATTR_ACTIONS, we use actions_len
    to allocate the skbuff, so the user_skb may be too small and an
    oops will happen like this:
      skbuff: skb_over_panic: text:ffffffff8148fabf len:1749 put:157 head:
      ffff881300f39000 data:ffff881300f39000 tail:0x6d5 end:0x6c0 dev:<NULL>
      ------------[ cut here ]------------
      kernel BUG at net/core/skbuff.c:129!
      [...]
      Call Trace:
       <IRQ>
       [<ffffffff8148be82>] skb_put+0x43/0x44
       [<ffffffff8148fabf>] skb_zerocopy+0x6c/0x1f4
       [<ffffffffa0290d36>] queue_userspace_packet+0x3a3/0x448 [openvswitch]
       [<ffffffffa0292023>] ovs_dp_upcall+0x30/0x5c [openvswitch]
       [<ffffffffa028d435>] output_userspace+0x132/0x158 [openvswitch]
       [<ffffffffa01e6890>] ? ip6_rcv_finish+0x74/0x77 [ipv6]
       [<ffffffffa028e277>] do_execute_actions+0xcc1/0xdc8 [openvswitch]
       [<ffffffffa028e3f2>] ovs_execute_actions+0x74/0x106 [openvswitch]
       [<ffffffffa0292130>] ovs_dp_process_packet+0xe1/0xfd [openvswitch]
       [<ffffffffa0292b77>] ? key_extract+0x63c/0x8d5 [openvswitch]
       [<ffffffffa029848b>] ovs_vport_receive+0xa1/0xc3 [openvswitch]
      [...]
    
    Also we can see that actions_len is much smaller than orig_len:
      crash> struct sw_flow_actions 0xffff8812f539d000
      struct sw_flow_actions {
        rcu = {
          next = 0xffff8812f5398800,
          func = 0xffffe3b00035db32
        },
        orig_len = 1384,
        actions_len = 592,
        actions = 0xffff8812f539d01c
      }
    
    So as a quick fix, use the orig_len instead of the actions_len to alloc
    the user_skb.
    
    Last, this oops happened on our system running a relatively old kernel,
    but the same risk still exists on the mainline, since we have used the
    wrong actions_len from the beginning.
    
    Fixes: ccea74457bbd ("openvswitch: include datapath actions with sampled-packet upcall to userspace")
    Cc: Neil McKee <neil.mckee@inmon.com>
    Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c6fc7b9892a5c28f8b593247dd1eef5df4a1490c
Author: David Ahern <dsahern@gmail.com>
Date:   Tue Aug 15 18:38:42 2017 -0700

    net: igmp: Use ingress interface rather than vrf device
    
    
    [ Upstream commit c7b725be84985532161bcb4fbecd056326998a77 ]
    
    Anuradha reported that statically added groups for interfaces enslaved
    to a VRF device were not persisting. The problem is that igmp queries
    and reports need to use the data in the in_dev for the real ingress
    device rather than the VRF device. Update igmp_rcv accordingly.
    
    Fixes: e58e41596811 ("net: Enable support for VRF with ipv4 multicast")
    Reported-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
    Signed-off-by: David Ahern <dsahern@gmail.com>
    Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 921739a95d4fcbdf205c8f9f8a0b9be20e51a5b3
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Wed Aug 16 01:45:33 2017 +0200

    bpf: fix bpf_trace_printk on 32 bit archs
    
    
    [ Upstream commit 88a5c690b66110ad255380d8f629c629cf6ca559 ]
    
    James reported that on MIPS32 bpf_trace_printk() is currently
    broken while MIPS64 works fine:
    
      bpf_trace_printk() uses conditional operators to attempt to
      pass different types to __trace_printk() depending on the
      format operators. This doesn't work as intended on 32-bit
      architectures where u32 and long are passed differently to
      u64, since the result of C conditional operators follows the
      "usual arithmetic conversions" rules, such that the values
      passed to __trace_printk() will always be u64 [causing issues
      later in the va_list handling for vscnprintf()].
    
      For example the samples/bpf/tracex5 test printed lines like
      below on MIPS32, where the fd and buf have come from the u64
      fd argument, and the size from the buf argument:
    
        [...] 1180.941542: 0x00000001: write(fd=1, buf=  (null), size=6258688)
    
      Instead of this:
    
        [...] 1625.616026: 0x00000001: write(fd=1, buf=009e4000, size=512)
    
    One way to get it working is to expand the various combinations
    of argument types into 8 different combinations for 32 bit
    and 64 bit kernels. James tested the fix on MIPS32 and MIPS64
    and confirmed that it resolves the issue.
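    
    The conversion rule described above can be checked with a tiny standalone
    sketch. cond_result_size() is a hypothetical helper, not part of
    bpf_trace_printk(); it only reports the size of the type that the
    conditional expression ends up with.
    
    ```c
    #include <stddef.h>
    #include <stdint.h>
    
    /*
     * The type of "c ? a : b" is fixed at compile time by the usual
     * arithmetic conversions, so with a u32 and a u64 operand the result
     * is always 64 bits wide, whichever branch is selected at run time.
     */
    size_t cond_result_size(int want_32bit, uint32_t a, uint64_t b)
    {
        return sizeof(want_32bit ? a : b);
    }
    ```
    
    Because the selected branch does not influence the type, the conditional
    operator cannot be used to pass a narrower argument to a variadic
    function.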
    
    Fixes: 9c959c863f82 ("tracing: Allow BPF programs to call bpf_trace_printk()")
    Reported-by: James Hogan <james.hogan@imgtec.com>
    Tested-by: James Hogan <james.hogan@imgtec.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 99f635d1e70dbb3e0e61e4b1eef9cc5b18c0414b
Author: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Date:   Tue Aug 15 16:39:05 2017 +0300

    net_sched: remove warning from qdisc_hash_add
    
    
    [ Upstream commit c90e95147c27b1780e76c6e8fea1b5c78d7d387f ]
    
    The warning was added in commit e57a784d8cae ("pkt_sched: set root qdisc
    before change() in attach_default_qdiscs()") to hide duplicates
    from "tc qdisc show" for inactive devices.
    
    After 59cc1f61f ("net: sched: convert qdisc linked list to hashtable")
    it triggers when a classful qdisc is added to an inactive device, because
    default qdiscs are added before switching the root qdisc.
    
    Anyway, after commit ea3274695353 ("net: sched: avoid duplicates in
    qdisc dump") duplicates are filtered right in the dumper.
    
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit cf665a603368cc6194a973bf9765c79803530d17
Author: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Date:   Tue Aug 15 16:37:04 2017 +0300

    net_sched/sfq: update hierarchical backlog when drop packet
    
    
    [ Upstream commit 325d5dc3f7e7c2840b65e4a2988c082c2c0025c5 ]
    
    When sfq_enqueue() drops the head packet or a packet from another queue,
    it has to update the backlog at upper qdiscs too.
    
    Fixes: 2ccccf5fb43f ("net_sched: update hierarchical backlog too")
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
    Acked-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 163db2c61aa16b7c71f5795c58e62f0ee5f84a01
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Aug 15 05:26:17 2017 -0700

    ipv4: fix NULL dereference in free_fib_info_rcu()
    
    
    [ Upstream commit 187e5b3ac84d3421d2de3aca949b2791fbcad554 ]
    
    If fi->fib_metrics could not be allocated in fib_create_info()
    we attempt to dereference a NULL pointer in free_fib_info_rcu() :
    
        m = fi->fib_metrics;
        if (m != &dst_default_metrics && atomic_dec_and_test(&m->refcnt))
                kfree(m);
    
    Before my recent patch, we used to call kfree(NULL) and nothing wrong
    happened.
    
    Instead of using RCU to defer freeing while we are under memory stress,
    it seems better to take immediate action.
    
    This was reported by syzkaller team.
    
    Fixes: 3fb07daff8e9 ("ipv4: add reference counting to metrics")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f1d0554639802449428e72c7e19c04d8ebc90665
Author: Eric Dumazet <edumazet@google.com>
Date:   Wed Aug 16 07:03:15 2017 -0700

    dccp: defer ccid_hc_tx_delete() at dismantle time
    
    
    [ Upstream commit 120e9dabaf551c6dc03d3a10a1f026376cb1811c ]
    
    syzkaller team reported another problem in DCCP [1]
    
    The problem here is that the structure holding the RTO timer
    (ccid2_hc_tx_rto_expire() handler) is freed too soon.
    
    We cannot use del_timer_sync() to cancel the timer, since this timer
    wants to grab the socket lock (that would risk a deadlock).
    
    The solution is to defer the freeing of memory until all references to
    the socket have been released. Socket timers do own a reference, so this
    should fix the issue.
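    
    The lifetime rule can be sketched in userspace like this. All names
    (struct sock_like, struct tx_state, sock_new(), sock_hold(), sock_put(),
    tx_frees) are hypothetical, and the plain int counter is a simplification:
    the kernel uses atomic reference counts. This is not the DCCP code.
    
    ```c
    #include <stdlib.h>
    
    /* Hypothetical per-socket TX state that a timer handler reads. */
    struct tx_state {
        int rto;
    };
    
    struct sock_like {
        int refcnt;             /* not atomic; illustration only */
        struct tx_state *tx;    /* released together with the socket */
    };
    
    /* Counts how often TX state was freed, for illustration. */
    int tx_frees;
    
    struct sock_like *sock_new(void)
    {
        struct sock_like *s = malloc(sizeof(*s));
    
        if (!s)
            return NULL;
        s->tx = malloc(sizeof(*s->tx));
        if (!s->tx) {
            free(s);
            return NULL;
        }
        s->tx->rto = 0;
        s->refcnt = 1;
        return s;
    }
    
    /* Take an extra reference, e.g. on behalf of an armed timer. */
    void sock_hold(struct sock_like *s)
    {
        s->refcnt++;
    }
    
    /* Drop a reference; the TX state goes away only with the last one. */
    void sock_put(struct sock_like *s)
    {
        if (--s->refcnt == 0) {
            free(s->tx);
            tx_frees++;
            free(s);
        }
    }
    ```
    
    As long as the timer holds its reference, dropping the owner's reference
    does not free the TX state, so a late timer callback still sees valid
    memory.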
    
    [1]
    
    ==================================================================
    BUG: KASAN: use-after-free in ccid2_hc_tx_rto_expire+0x51c/0x5c0 net/dccp/ccids/ccid2.c:144
    Read of size 4 at addr ffff8801d2660540 by task kworker/u4:7/3365
    
    CPU: 1 PID: 3365 Comm: kworker/u4:7 Not tainted 4.13.0-rc4+ #3
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Workqueue: events_unbound call_usermodehelper_exec_work
    Call Trace:
     <IRQ>
     __dump_stack lib/dump_stack.c:16 [inline]
     dump_stack+0x194/0x257 lib/dump_stack.c:52
     print_address_description+0x73/0x250 mm/kasan/report.c:252
     kasan_report_error mm/kasan/report.c:351 [inline]
     kasan_report+0x24e/0x340 mm/kasan/report.c:409
     __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:429
     ccid2_hc_tx_rto_expire+0x51c/0x5c0 net/dccp/ccids/ccid2.c:144
     call_timer_fn+0x233/0x830 kernel/time/timer.c:1268
     expire_timers kernel/time/timer.c:1307 [inline]
     __run_timers+0x7fd/0xb90 kernel/time/timer.c:1601
     run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
     __do_softirq+0x2f5/0xba3 kernel/softirq.c:284
     invoke_softirq kernel/softirq.c:364 [inline]
     irq_exit+0x1cc/0x200 kernel/softirq.c:405
     exiting_irq arch/x86/include/asm/apic.h:638 [inline]
     smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:1044
     apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:702
    RIP: 0010:arch_local_irq_enable arch/x86/include/asm/paravirt.h:824 [inline]
    RIP: 0010:__raw_write_unlock_irq include/linux/rwlock_api_smp.h:267 [inline]
    RIP: 0010:_raw_write_unlock_irq+0x56/0x70 kernel/locking/spinlock.c:343
    RSP: 0018:ffff8801cd50eaa8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10
    RAX: dffffc0000000000 RBX: ffffffff85a090c0 RCX: 0000000000000006
    RDX: 1ffffffff0b595f3 RSI: 1ffff1003962f989 RDI: ffffffff85acaf98
    RBP: ffff8801cd50eab0 R08: 0000000000000001 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801cc96ea60
    R13: dffffc0000000000 R14: ffff8801cc96e4c0 R15: ffff8801cc96e4c0
     </IRQ>
     release_task+0xe9e/0x1a40 kernel/exit.c:220
     wait_task_zombie kernel/exit.c:1162 [inline]
     wait_consider_task+0x29b8/0x33c0 kernel/exit.c:1389
     do_wait_thread kernel/exit.c:1452 [inline]
     do_wait+0x441/0xa90 kernel/exit.c:1523
     kernel_wait4+0x1f5/0x370 kernel/exit.c:1665
     SYSC_wait4+0x134/0x140 kernel/exit.c:1677
     SyS_wait4+0x2c/0x40 kernel/exit.c:1673
     call_usermodehelper_exec_sync kernel/kmod.c:286 [inline]
     call_usermodehelper_exec_work+0x1a0/0x2c0 kernel/kmod.c:323
     process_one_work+0xbf3/0x1bc0 kernel/workqueue.c:2097
     worker_thread+0x223/0x1860 kernel/workqueue.c:2231
     kthread+0x35e/0x430 kernel/kthread.c:231
     ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:425
    
    Allocated by task 21267:
     save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
     save_stack+0x43/0xd0 mm/kasan/kasan.c:447
     set_track mm/kasan/kasan.c:459 [inline]
     kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
     kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:489
     kmem_cache_alloc+0x127/0x750 mm/slab.c:3561
     ccid_new+0x20e/0x390 net/dccp/ccid.c:151
     dccp_hdlr_ccid+0x27/0x140 net/dccp/feat.c:44
     __dccp_feat_activate+0x142/0x2a0 net/dccp/feat.c:344
     dccp_feat_activate_values+0x34e/0xa90 net/dccp/feat.c:1538
     dccp_rcv_request_sent_state_process net/dccp/input.c:472 [inline]
     dccp_rcv_state_process+0xed1/0x1620 net/dccp/input.c:677
     dccp_v4_do_rcv+0xeb/0x160 net/dccp/ipv4.c:679
     sk_backlog_rcv include/net/sock.h:911 [inline]
     __release_sock+0x124/0x360 net/core/sock.c:2269
     release_sock+0xa4/0x2a0 net/core/sock.c:2784
     inet_wait_for_connect net/ipv4/af_inet.c:557 [inline]
     __inet_stream_connect+0x671/0xf00 net/ipv4/af_inet.c:643
     inet_stream_connect+0x58/0xa0 net/ipv4/af_inet.c:682
     SYSC_connect+0x204/0x470 net/socket.c:1642
     SyS_connect+0x24/0x30 net/socket.c:1623
     entry_SYSCALL_64_fastpath+0x1f/0xbe
    
    Freed by task 3049:
     save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
     save_stack+0x43/0xd0 mm/kasan/kasan.c:447
     set_track mm/kasan/kasan.c:459 [inline]
     kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
     __cache_free mm/slab.c:3503 [inline]
     kmem_cache_free+0x77/0x280 mm/slab.c:3763
     ccid_hc_tx_delete+0xc5/0x100 net/dccp/ccid.c:190
     dccp_destroy_sock+0x1d1/0x2b0 net/dccp/proto.c:225
     inet_csk_destroy_sock+0x166/0x3f0 net/ipv4/inet_connection_sock.c:833
     dccp_done+0xb7/0xd0 net/dccp/proto.c:145
     dccp_time_wait+0x13d/0x300 net/dccp/minisocks.c:72
     dccp_rcv_reset+0x1d1/0x5b0 net/dccp/input.c:160
     dccp_rcv_state_process+0x8fc/0x1620 net/dccp/input.c:663
     dccp_v4_do_rcv+0xeb/0x160 net/dccp/ipv4.c:679
     sk_backlog_rcv include/net/sock.h:911 [inline]
     __sk_receive_skb+0x33e/0xc00 net/core/sock.c:521
     dccp_v4_rcv+0xef1/0x1c00 net/dccp/ipv4.c:871
     ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
     NF_HOOK include/linux/netfilter.h:248 [inline]
     ip_local_deliver+0x1ce/0x6d0 net/ipv4/ip_input.c:257
     dst_input include/net/dst.h:477 [inline]
     ip_rcv_finish+0x8db/0x19c0 net/ipv4/ip_input.c:397
     NF_HOOK include/linux/netfilter.h:248 [inline]
     ip_rcv+0xc3f/0x17d0 net/ipv4/ip_input.c:488
     __netif_receive_skb_core+0x19af/0x33d0 net/core/dev.c:4417
     __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4455
     process_backlog+0x203/0x740 net/core/dev.c:5130
     napi_poll net/core/dev.c:5527 [inline]
     net_rx_action+0x792/0x1910 net/core/dev.c:5593
     __do_softirq+0x2f5/0xba3 kernel/softirq.c:284
    
    The buggy address belongs to the object at ffff8801d2660100
     which belongs to the cache ccid2_hc_tx_sock of size 1240
    The buggy address is located 1088 bytes inside of
     1240-byte region [ffff8801d2660100, ffff8801d26605d8)
    The buggy address belongs to the page:
    page:ffffea0007499800 count:1 mapcount:0 mapping:ffff8801d2660100 index:0x0 compound_mapcount: 0
    flags: 0x200000000008100(slab|head)
    raw: 0200000000008100 ffff8801d2660100 0000000000000000 0000000100000005
    raw: ffffea00075271a0 ffffea0007538820 ffff8801d3aef9c0 0000000000000000
    page dumped because: kasan: bad access detected
    
    Memory state around the buggy address:
     ffff8801d2660400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
     ffff8801d2660480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    >ffff8801d2660500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                               ^
     ffff8801d2660580: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
     ffff8801d2660600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ==================================================================
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a8de69b93e86255e079f0b2ef64b685217858284
Author: Eric Dumazet <edumazet@google.com>
Date:   Mon Aug 14 14:10:25 2017 -0700

    dccp: purge write queue in dccp_destroy_sock()
    
    
    [ Upstream commit 7749d4ff88d31b0be17c8683143135adaaadc6a7 ]
    
    syzkaller reported that DCCP could have a non empty
    write queue at dismantle time.
    
    WARNING: CPU: 1 PID: 2953 at net/core/stream.c:199 sk_stream_kill_queues+0x3ce/0x520 net/core/stream.c:199
    Kernel panic - not syncing: panic_on_warn set ...
    
    CPU: 1 PID: 2953 Comm: syz-executor0 Not tainted 4.13.0-rc4+ #2
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
     __dump_stack lib/dump_stack.c:16 [inline]
     dump_stack+0x194/0x257 lib/dump_stack.c:52
     panic+0x1e4/0x417 kernel/panic.c:180
     __warn+0x1c4/0x1d9 kernel/panic.c:541
     report_bug+0x211/0x2d0 lib/bug.c:183
     fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190
     do_trap_no_signal arch/x86/kernel/traps.c:224 [inline]
     do_trap+0x260/0x390 arch/x86/kernel/traps.c:273
     do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:310
     do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323
     invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:846
    RIP: 0010:sk_stream_kill_queues+0x3ce/0x520 net/core/stream.c:199
    RSP: 0018:ffff8801d182f108 EFLAGS: 00010297
    RAX: ffff8801d1144140 RBX: ffff8801d13cb280 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: ffffffff85137b00 RDI: ffff8801d13cb280
    RBP: ffff8801d182f148 R08: 0000000000000001 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801d13cb4d0
    R13: ffff8801d13cb3b8 R14: ffff8801d13cb300 R15: ffff8801d13cb3b8
     inet_csk_destroy_sock+0x175/0x3f0 net/ipv4/inet_connection_sock.c:835
     dccp_close+0x84d/0xc10 net/dccp/proto.c:1067
     inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
     sock_release+0x8d/0x1e0 net/socket.c:597
     sock_close+0x16/0x20 net/socket.c:1126
     __fput+0x327/0x7e0 fs/file_table.c:210
     ____fput+0x15/0x20 fs/file_table.c:246
     task_work_run+0x18a/0x260 kernel/task_work.c:116
     exit_task_work include/linux/task_work.h:21 [inline]
     do_exit+0xa32/0x1b10 kernel/exit.c:865
     do_group_exit+0x149/0x400 kernel/exit.c:969
     get_signal+0x7e8/0x17e0 kernel/signal.c:2330
     do_signal+0x94/0x1ee0 arch/x86/kernel/signal.c:808
     exit_to_usermode_loop+0x21c/0x2d0 arch/x86/entry/common.c:157
     prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
     syscall_return_slowpath+0x3a7/0x450 arch/x86/entry/common.c:263
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 94fd355614e33126053192d89f10d1fff18023c8
Author: Eric Dumazet <edumazet@google.com>
Date:   Mon Aug 14 10:16:45 2017 -0700

    af_key: do not use GFP_KERNEL in atomic contexts
    
    
    [ Upstream commit 36f41f8fc6d8aa9f8c9072d66ff7cf9055f5e69b ]
    
    pfkey_broadcast() might be called from non-process contexts, so
    we cannot use GFP_KERNEL in these cases [1].
    
    This patch partially reverts commit ba51b6be38c1 ("net: Fix RCU splat in
    af_key"), only keeping the GFP_ATOMIC forcing under rcu_read_lock()
    section.
    
    [1] : syzkaller reported :
    
    in_atomic(): 1, irqs_disabled(): 0, pid: 2932, name: syzkaller183439
    3 locks held by syzkaller183439/2932:
     #0:  (&net->xfrm.xfrm_cfg_mutex){+.+.+.}, at: [<ffffffff83b43888>] pfkey_sendmsg+0x4c8/0x9f0 net/key/af_key.c:3649
     #1:  (&pfk->dump_lock){+.+.+.}, at: [<ffffffff83b467f6>] pfkey_do_dump+0x76/0x3f0 net/key/af_key.c:293
     #2:  (&(&net->xfrm.xfrm_policy_lock)->rlock){+...+.}, at: [<ffffffff83957632>] spin_lock_bh include/linux/spinlock.h:304 [inline]
     #2:  (&(&net->xfrm.xfrm_policy_lock)->rlock){+...+.}, at: [<ffffffff83957632>] xfrm_policy_walk+0x192/0xa30 net/xfrm/xfrm_policy.c:1028
    CPU: 0 PID: 2932 Comm: syzkaller183439 Not tainted 4.13.0-rc4+ #24
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
     __dump_stack lib/dump_stack.c:16 [inline]
     dump_stack+0x194/0x257 lib/dump_stack.c:52
     ___might_sleep+0x2b2/0x470 kernel/sched/core.c:5994
     __might_sleep+0x95/0x190 kernel/sched/core.c:5947
     slab_pre_alloc_hook mm/slab.h:416 [inline]
     slab_alloc mm/slab.c:3383 [inline]
     kmem_cache_alloc+0x24b/0x6e0 mm/slab.c:3559
     skb_clone+0x1a0/0x400 net/core/skbuff.c:1037
     pfkey_broadcast_one+0x4b2/0x6f0 net/key/af_key.c:207
     pfkey_broadcast+0x4ba/0x770 net/key/af_key.c:281
     dump_sp+0x3d6/0x500 net/key/af_key.c:2685
     xfrm_policy_walk+0x2f1/0xa30 net/xfrm/xfrm_policy.c:1042
     pfkey_dump_sp+0x42/0x50 net/key/af_key.c:2695
     pfkey_do_dump+0xaa/0x3f0 net/key/af_key.c:299
     pfkey_spddump+0x1a0/0x210 net/key/af_key.c:2722
     pfkey_process+0x606/0x710 net/key/af_key.c:2814
     pfkey_sendmsg+0x4d6/0x9f0 net/key/af_key.c:3650
     sock_sendmsg_nosec net/socket.c:633 [inline]
     sock_sendmsg+0xca/0x110 net/socket.c:643
     ___sys_sendmsg+0x755/0x890 net/socket.c:2035
     __sys_sendmsg+0xe5/0x210 net/socket.c:2069
     SYSC_sendmsg net/socket.c:2080 [inline]
     SyS_sendmsg+0x2d/0x50 net/socket.c:2076
     entry_SYSCALL_64_fastpath+0x1f/0xbe
    RIP: 0033:0x445d79
    RSP: 002b:00007f32447c1dc8 EFLAGS: 00000202 ORIG_RAX: 000000000000002e
    RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000445d79
    RDX: 0000000000000000 RSI: 000000002023dfc8 RDI: 0000000000000008
    RBP: 0000000000000086 R08: 00007f32447c2700 R09: 00007f32447c2700
    R10: 00007f32447c2700 R11: 0000000000000202 R12: 0000000000000000
    R13: 00007ffe33edec4f R14: 00007f32447c29c0 R15: 0000000000000000
    
    Fixes: ba51b6be38c1 ("net: Fix RCU splat in af_key")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Cc: David Ahern <dsa@cumulusnetworks.com>
    Acked-by: David Ahern <dsahern@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 72942014297947aed30d60eceb5fa352f7e1673c
Author: Andreas Born <futur.andy@googlemail.com>
Date:   Sat Aug 12 00:36:55 2017 +0200

    bonding: ratelimit failed speed/duplex update warning
    
    
    [ Upstream commit 11e9d7829dd08dbafb24517fe922f11c3a8a9dc2 ]
    
    bond_miimon_commit() handles the UP transition for each slave of a bond
    in the case of MII. It is triggered 10 times per second for the default
    MII Polling interval of 100ms. For device drivers that do not implement
    __ethtool_get_link_ksettings() the call to bond_update_speed_duplex()
    fails persistently while the MII status could remain UP. That is, in
    this and other cases where the speed/duplex update keeps failing over a
    longer period of time while the MII state is UP, a warning is printed
    every MII polling interval.
    
    To address these excessive warnings net_ratelimit() should be used.
    Printing a warning once would not be sufficient since the call to
    bond_update_speed_duplex() could recover to succeed and fail again
    later. In that case there would be no new indication what went wrong.
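    
    The difference between rate limiting and printing once can be sketched
    with a simplified window/burst limiter. This is written in the spirit of
    net_ratelimit() but is not the kernel implementation; struct ratelimit
    and ratelimit_ok() are hypothetical, and time is passed in by the caller
    in arbitrary units.
    
    ```c
    /* Hypothetical limiter state: at most burst messages per interval. */
    struct ratelimit {
        long interval;  /* window length, arbitrary time units */
        int burst;      /* messages allowed per window */
        long begin;     /* start of the current window */
        int printed;    /* messages emitted in the current window */
    };
    
    /* Returns 1 if the caller may print at time now, 0 if suppressed. */
    int ratelimit_ok(struct ratelimit *rs, long now)
    {
        /* A new window resets the budget, so later failures show up again. */
        if (now - rs->begin >= rs->interval) {
            rs->begin = now;
            rs->printed = 0;
        }
        if (rs->printed < rs->burst) {
            rs->printed++;
            return 1;
        }
        return 0;
    }
    ```
    
    Unlike a print-once flag, the budget is refilled each window, so a
    failure that recurs much later is still reported.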
    
    Fixes: b5bf0f5b16b9c (bonding: correctly update link status during mii-commit phase)
    Signed-off-by: Andreas Born <futur.andy@googlemail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b39ae1c8bdc4e4b55ed84c4350ea38db72724512
Author: Andreas Born <futur.andy@googlemail.com>
Date:   Thu Aug 10 06:41:44 2017 +0200

    bonding: require speed/duplex only for 802.3ad, alb and tlb
    
    
    [ Upstream commit ad729bc9acfb7c47112964b4877ef5404578ed13 ]
    
    The patch c4adfc822bf5 ("bonding: make speed, duplex setting consistent
    with link state") puts the link state to down if
    bond_update_speed_duplex() cannot retrieve speed and duplex settings.
    Presumably the patch was written with 802.3ad mode in mind which relies
    on link speed/duplex settings. For other modes like active-backup these
    settings are not required. Thus, only for these other modes, this patch
    reintroduces support for slaves that do not support reporting speed or
    duplex such as wireless devices. This fixes the regression reported in
    bug 196547 (https://bugzilla.kernel.org/show_bug.cgi?id=196547).
    
    Fixes: c4adfc822bf5 ("bonding: make speed, duplex setting consistent with link state")
    Signed-off-by: Andreas Born <futur.andy@googlemail.com>
    Acked-by: Mahesh Bandewar <maheshb@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 16caf8dff7ee7d93a12ad61253cffbcb414e823f
Author: Tushar Dave <tushar.n.dave@oracle.com>
Date:   Wed Aug 16 11:09:10 2017 -0700

    sparc64: remove unnecessary log message
    
    [ Upstream commit 6170a506899aee3dd4934c928426505e47b1b466 ]
    
    There is no need to log a message if the ATU hvapi could not be
    registered. Unlike the PCI hvapi, ATU hvapi registration failure is not
    a hard error. Even if ATU hvapi registration fails (on a system with or
    without ATU), the system continues with the legacy IOMMU. So only log a
    message when the ATU hvapi is successfully registered.
    
    Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>