Bug 907193 - Error on boot: "ata2.00: failed to enable AA (error_mask=0x1)"
Summary: Error on boot: "ata2.00: failed to enable AA (error_mask=0x1)"
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 20
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 982053
Depends On:
Blocks:
 
Reported: 2013-02-03 17:26 UTC by jg.macia
Modified: 2014-07-22 23:29 UTC
CC List: 18 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-03-04 14:24:05 UTC
Type: Bug
Embargoed:


Attachments
dmesg and smartctl output (6.39 KB, text/plain)
2013-11-25 14:28 UTC, Nicholas Little
Updated patch for 3.10.17 (562 bytes, patch)
2013-11-25 16:03 UTC, Nicholas Little
dmesg and smartcl output IdeaPad Y410p Laptop (7.15 KB, text/plain)
2014-03-04 07:09 UTC, Guilherme Amadio
smartctl and dmesg output on a HP Proliant g7 amd turion (15.05 KB, text/plain)
2014-07-07 17:02 UTC, Adriano

Description jg.macia 2013-02-03 17:26:02 UTC
Description of problem:
I get the following error on boot or when resuming from suspend (in the latter case, the errors are logged in addition to the ones that occurred on boot):
[    2.555905] ata2.00: failed to enable AA (error_mask=0x1)
[    2.568482] ata2.00: failed to enable AA (error_mask=0x1)
Sometimes the system freezes after the error, and a restart is necessary.

Version-Release number of selected component (if applicable):


How reproducible:
The error always appears on boot or after resuming from suspend; the system freezes only sometimes (ca. 10-20% of the time).

Steps to Reproduce:
1. Boot computer or
2. Start after suspension
  
Actual results:
Previously described error occurs.

Expected results:
No error and no freeze when the computer starts.

Additional info:
Error started on F17 after upgrade from F16, and is persistent in F18.

Comment 1 Nirmal Pathak 2013-06-20 18:55:24 UTC
This is what I get on my new Lenovo Ideapad Z580 at boot time.

ata1.00: failed to enable AA (error_mask = 0x1)
ata1.00: failed to enable AA (error_mask = 0x1)

Here's my output of 'smartctl' - http://fpaste.org/19779/

Comment 2 de Sablet 2013-06-25 13:26:03 UTC
I get the same error on my new Lenovo Ideapad G580
ata1.00: failed to enable AA (error_mask = 0x1)
ata1.00: failed to enable AA (error_mask = 0x1)

Comment 3 Justin M. Forbes 2013-10-18 21:12:24 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 18 kernel bugs.

Fedora 18 has now been rebased to 3.11.4-101.fc18.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 19, and are still experiencing this issue, please change the version to Fedora 19.

If you experience different issues, please open a new bug report for those.

Comment 4 Andreas Thienemann 2013-11-05 10:28:30 UTC
Still an issue on 3.11.6-302.fc20.x86_64 from F20-Alpha

Comment 5 Michele Baldessari 2013-11-16 22:56:24 UTC
A couple of comments.

The error comes from drivers/ata/libata-core.c:ata_dev_config_ncq()
if (!(dev->horkage & ATA_HORKAGE_BROKEN_FPDMA_AA) &&                    
        (ap->flags & ATA_FLAG_FPDMA_AA) &&                              
        ata_id_has_fpdma_aa(dev->id)) {                                 
        err_mask = ata_dev_set_feature(dev, SETFEATURES_SATA_ENABLE,    
                SATA_FPDMA_AA);                                         
        if (err_mask) {                                                 
                ata_dev_err(dev,                                        
                            "failed to enable AA (error_mask=0x%x)\n",  
                            err_mask);                                  
                if (err_mask != AC_ERR_DEV) {                           
                        dev->horkage |= ATA_HORKAGE_BROKEN_FPDMA_AA;    
                        return -EIO;                                    
                }                                                       
        } else                                                          
                aa_desc = ", AA";                                       
}                                                                       

So it seems that the drive somehow advertises the ATA_FLAG_FPDMA_AA 
flag and ata_id_has_fpdma_aa(dev->id) returns true:
#define ata_id_has_fpdma_aa(id) \
        ((((id)[ATA_ID_SATA_CAPABILITY] != 0x0000) && \
          ((id)[ATA_ID_SATA_CAPABILITY] != 0xffff)) && \
         ((id)[ATA_ID_FEATURE_SUPP] & (1 << 2)))

So this drive advertises this FPDMA FIS Auto Activate bit 
(http://download.intel.com/support/chipsets/imsm/sb/sata2_ncq_overview.pdf)
but when we try to set this feature it returns error.

So far I see four people having the issue on this BZ.
Could we all confirm the following:
- Is everyone on Lenovo Ideapads?
- Can everyone attach the output of 'smartctl -a /dev/sda' to this case?

If we see that everyone has the same drive (I see Nirmal has posted that info here
https://ask.fedoraproject.org/question/26877/error-ata100-failed-to-enable-aa-at-boot-time-on-fedora-18/ 
and has the following drive: http://samsunghdd.seagate.com/includes/spinpoint-m8-ds.pdf), we could try to
blacklist the feature via ATA_HORKAGE_BROKEN_FPDMA_AA for this specific drive.

Disabling NCQ would also work, but would kill performance.

Thanks,
Michele

Comment 6 Michele Baldessari 2013-11-16 23:34:49 UTC
*** Bug 982053 has been marked as a duplicate of this bug. ***

Comment 7 Nicholas Little 2013-11-25 14:28:00 UTC
Created attachment 828707 [details]
dmesg and smartctl output

Hi, I'm on an Ideapad z500 with kernel 3.10.17 (it's a gentoo system) and experiencing this error message on startup.

I've attached my dmesg output and smartctl response for the drive.

Hope this helps.

Thanks.

Comment 8 Michele Baldessari 2013-11-25 15:24:53 UTC
Thanks Nicholas. Can you try the following and report back?
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 81a94a3..f6b0892 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4156,6 +4156,9 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
        { "ST3320[68]13AS",     "SD1[5-9]",     ATA_HORKAGE_NONCQ |
                                                ATA_HORKAGE_FIRMWARE_WARN },
 
+       /* Seagate Momentus SpinPoint M8 seem to have FPMDA_AA issues */
+       { "ST1000LM024 HN-M101MBB", "2AR10001", ATA_HORKAGE_BROKEN_FPDMA_AA },
+
        /* Blacklist entries taken from Silicon Image 3124/3132
           Windows driver .inf file - also several Linux problem reports */
        { "HTS541060G9SA00",    "MB3OC60D",     ATA_HORKAGE_NONCQ, },


Thanks,
Michele

Comment 9 Nicholas Little 2013-11-25 16:03:31 UTC
Created attachment 828750 [details]
Updated patch for 3.10.17

Hi Michele, thanks for such a quick response!

My boot screen is now free from kernel error messages :)

I had to modify the patch a little. I'm assuming you made the diff against an F20 kernel, possibly in the 3.11.x series. Perhaps it's the version difference, or perhaps some Gentoo patches modify that area of the file; I'm not sure. I've attached the modified version in case it's useful for backporting the change to F19, assuming it's not EOL.

I'm hoping if this gets accepted it'll make it upstream at some point so gentoo will eventually get it too?

Comment 10 Michele Baldessari 2013-11-25 19:02:20 UTC
Hi Nicholas,

thanks for the feedback. Yes, the patch was against current git HEAD from Linus,
so some tweaks were expected. I've submitted the patch upstream. I'll post an
update here with the status.

regards,
Michele

Comment 11 Michele Baldessari 2013-11-30 08:49:45 UTC
This has been now applied to libata/for-3.13-fixes w/ stable cc'd

Can be moved to POST

Comment 12 Justin M. Forbes 2014-02-24 13:58:09 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 20 kernel bugs.

Fedora 20 has now been rebased to 3.13.4-200.fc20.  Please test this kernel update and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you experience different issues, please open a new bug report for those.

Comment 13 Guilherme Amadio 2014-03-04 05:29:24 UTC
Hi,

I just got an IdeaPad Y410p and I get the same error even after the patch (the hdd model is the same as the others). Can someone post their kernel configuration so that I can make sure that I have it correctly?

Thanks,
Guilherme

Comment 14 Guilherme Amadio 2014-03-04 07:09:31 UTC
Created attachment 870267 [details]
dmesg and smartcl output IdeaPad Y410p Laptop

I noticed that my firmware is different than what is on the previous attachment.

Comment 15 Nicholas Little 2014-03-04 11:12:16 UTC
(In reply to Guilherme Amadio from comment #13)
> Hi,
> 
> I just got an IdeaPad Y410p and I get the same error even after the patch
> (the hdd model is the same as the others). Can someone post their kernel
> configuration so that I can make sure that I have it correctly?
> 
> Thanks,
> Guilherme

I think kernel configuration won't make a difference here.

(In reply to Guilherme Amadio from comment #14)
> Created attachment 870267 [details]
> dmesg and smartcl output IdeaPad Y410p Laptop
> 
> I noticed that my firmware is different than what is on the previous
> attachment.

I'm assuming if you modify the patch to include the firmware version of your drive then the error message goes away?

Comment 16 Guilherme Amadio 2014-03-04 14:02:18 UTC
Thanks for politely bringing that up. When I noticed the firmware mismatch I spent hours trying to download the firmware update from Seagate and completely missed the fact that the firmware number was on the patch. Now I don't get any errors either.

Best regards,
Guilherme

Comment 17 Nicholas Little 2014-03-04 14:28:23 UTC
Great!

If Michele is still watching could we add this firmware revision as an additional patch?

{ "ST1000LM024 HN-M101MBB", "2BA30001", ATA_HORKAGE_BROKEN_FPDMA_AA }

Might be useful to wait a short while to see if any other users of that version firmware report in.

I don't really know anything about the FPDMA_AA feature or what it does, but if it could be fixed by a firmware update then we don't want all revisions blacklisted, I hope that's the case :)

Of course, if it's a hardware issue then blacklisting by drive model might be the correct solution, if that's possible.

Josh, can we say this is fixed without adding a patch for Guilherme's firmware revision?

Thanks!

Comment 18 Josh Boyer 2014-03-04 14:46:48 UTC
Yes, this can remain closed.  If another patch is needed for a different firmware/machine, we can still do that and if people want to track it we can open a new bug.  Otherwise this will just remain open indefinitely for as many iterations as it takes.  We try to stick to one fix/patch/bug per issue.  Thanks.

Comment 19 Michele Baldessari 2014-03-07 11:16:52 UTC
Hi Nicholas,

can you file a new BZ with all the info and assign it to me, please? I'll then make sure to push things upstream.

thanks,
Michele

Comment 20 Nicholas Little 2014-03-07 13:18:11 UTC
Thanks Michele, I couldn't actually assign it to you myself. But I selected the correct component so you should get a notification ;)

Comment 21 Laurent D 2014-03-10 17:18:39 UTC
Hi,

I recently got an IdeaPad Z710, and I have the same issue now.
Since I'm quite a beginner in the booting and Kernel stuff, can anyone try to help me solve this (if there is any solution of course) ?

I'm running under F20.

Thanks in advance !

Laurent

Comment 22 Nicholas Little 2014-03-10 17:27:49 UTC
(In reply to Laurent D from comment #21)
> Hi,
> 
> I recently got an IdeaPad Z710, and I have the same issue now.
> Since I'm quite a beginner in the booting and Kernel stuff, can anyone try
> to help me solve this (if there is any solution of course) ?
> 
> I'm running under F20.
> 
> Thanks in advance !
> 
> Laurent

Instructions for getting your drive's information are earlier in this thread.

What drive firmware do you have? If it's 2BA30001 then please add your details from the instructions here to 
https://bugzilla.redhat.com/show_bug.cgi?id=1073901. If it's different then could you file a new bug please?

Many thanks.

Comment 23 Laurent D 2014-03-10 17:44:02 UTC
Hi,

Yeah, indeed, mine is 2BA30001. As I told you, I'm new to this. I'll try this patch and keep you updated.

Thanks for this answer !

Laurent

Comment 24 Laurent D 2014-03-10 18:40:46 UTC
Well, now I feel quite stupid, I can't locate the file
drivers/ata/libata-core.c
to which I guess I have to add the two following lines

    /* Seagate Momentus SpinPoint M8 seem to have FPMDA_AA issues */
    { "ST1000LM024 HN-M101MBB", "2AR10001", ATA_HORKAGE_BROKEN_FPDMA_AA },

Replacing 2AR10001 with 2AR30001, I suppose ?

Do I have to mount any filesystem to access this file in a different partition ?

Sorry for disturbing, and thanks in advance.

Laurent

Comment 25 Guilherme Amadio 2014-03-10 19:07:45 UTC
Hi Laurent,

You most likely do not have that file if you are using a binary distribution like Fedora. You need to download the source code for the Linux kernel, which is usually stored under /usr/src/linux, and then modify the file /usr/src/linux/drivers/ata/libata-core.c (i.e. find the lines with the other firmware numbers and add yours accordingly). Then you need to configure and compile the kernel for your machine, set your bootloader to use the new kernel, and see if the problem goes away.
I use Gentoo, so I cannot give you instructions on how to use yum to get the kernel sources, but you can go to kernel.org and download the vanilla sources directly. You then unpack them into /usr/src/linux and do what I describe above. There are many tutorials on the internet on how to compile a kernel. Here's one for Arch Linux: https://wiki.archlinux.org/index.php/Kernels/Compilation/Traditional
Also, you may have access to the config of the currently running kernel via /proc/config.gz, which may save you a lot of time. If that file exists, you can use zcat /proc/config.gz > /usr/src/linux/.config to copy it to the kernel you want to compile. Just make sure to use a similar kernel version, since options may change from one release to the next.
I hope this helps.

Best,
—Guilherme

Comment 26 Laurent D 2014-03-10 21:02:31 UTC
Thanks,

I've read that the patch has been added to the 3.14 kernel version, can I simply take this version for my kernel ? Or is it still a very unstable one ?

And can you briefly explain to me why this error suddenly appeared without any change ? (In my last session on Linux, I didn't change or update anything ...)

Thanks again.

Laurent

Comment 27 Guilherme Amadio 2014-03-10 21:14:00 UTC
I took a look at the current 3.14-rc6 tree and the patch is not there:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-core.c?id=refs/tags/v3.14-rc6#n4176

The only firmwares currently patched for are 2AR10001 and 2BA30001. Your firmware, 2AR30001, is not listed for now. So you need to download the kernel and change it yourself, since you only need to change a single line. I'd use the current stable version, 3.13.6 instead of the current release candidate.

I don't understand your second question, though. What do you mean by the error suddenly happening without any changes? You may have only noticed it now, or, if you recently upgraded your kernel, something changed in the kernel that makes the patch necessary. I hope that answers your question.

Best,
—Guilherme

Comment 28 Guilherme Amadio 2014-03-10 21:30:01 UTC
Since more firmwares are affected, I just filed a new bug for this at the kernel bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=71821

—Guilherme

Comment 29 Laurent D 2014-03-10 21:36:26 UTC
I made a mistake in one of my recent posts. My firmware is 2BA30001 as I first wrote, and not 2AR30001.

Thanks for your answer.

Comment 30 thomasnaake 2014-04-21 19:53:54 UTC
Hi,
I am having problems as well:
[182952.110858] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[182952.117455] ata1.00: failed to enable AA (error_mask=0x1)
[182952.124126] ata1.00: failed to enable AA (error_mask=0x1)
[182952.124133] ata1.00: configured for UDMA/100

But I haven't understood the things you've talked about for I am a Fedora novice. I'm running a Lenovo Thinkpad Z580 under F19. 

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Momentus SpinPoint M8 (AF)
Device Model:     ST750LM022 HN-M750MBB
Serial Number:    S2USJ9KCA00749
LU WWN Device Id: 5 0004cf 20887bdaf
Firmware Version: 2AR10001
User Capacity:    750,156,374,016 bytes [750 GB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Mon Apr 21 20:46:16 2014 BST
SMART support is: Available - device has SMART capability.

Is anybody able to give any advice on solving the problem?

Comment 31 Adriano 2014-07-07 17:02:29 UTC
Created attachment 916138 [details]
smartctl and dmesg output on a HP Proliant g7 amd turion

I just want to make clear that this problem affects machines other than Lenovo.

Comment 32 Adriano 2014-07-07 17:06:40 UTC
Hello people,

I'm not sure I understand all you discussed in this thread but I'm sure that my machine is NOT a Lenovo and it IS affected indeed.

I'm running Fedora20 64 on a HP Proliant Microserver g7 AMD turion.

I added an attachment with output from smartctl AND from dmesg.

Both the drives in my raid 0 array are Seagate.

The kernel I'm currently running is 3.14.5-200

Comment 33 Adriano 2014-07-07 17:09:24 UTC
I just noticed that the dmesg output in my case is different. It is:

 
[adriano@server1 ~]$ dmesg | grep "failed to enable" 
[    2.282599] ata1.00: failed to enable AA (error_mask=0x1)
[    2.283502] ata1.00: failed to enable AA (error_mask=0x1)

It's ata1.00 and not ata2.00

Comment 34 Michele Baldessari 2014-07-08 10:36:19 UTC
Adriano,

please create a fresh new bugzilla with smartctl output of all the drives and full unfiltered dmesg. We will likely need to add another quirk for your
type of drives.

thanks,
Michele

Comment 35 Adriano 2014-07-22 23:29:10 UTC
Michele,

thank you. I did, it's here https://bugzilla.redhat.com/show_bug.cgi?id=1121668

