19:05 * marga Margarita Manterola
19:05 * ntyni Niko Tyni
19:06 * smcv Simon McVittie
19:06 * gwolf Gunnar Wolf
19:07 < gwolf> I guess we can feed the logs to the meetbot processor or so...
19:07 * gwolf pingalls ctte..
19:07 <@marga> OdyX said he would likely miss this session...
19:07 <@marga> fil, ?
19:07 <@marga> And the rest of our members seem to not be on the channel...
19:07 < smcv> we've all got so used to the meetings all being cancelled
19:08 <@marga> Alright, let's move to our first topic
19:08 < smcv> it's typical, you wait 3 months for a bus^Wagenda item and three come along at once
19:08 -!- marga changed the topic of #debian-ctte to: #904302 Whether vendor-specific patch series should be permitted in the archive
19:09 <@marga> I think we basically have consensus on this one.
19:09 <@marga> I'm sad that Tollef isn't here, as he seemed to have volunteered to write the proposal for the vote.
19:09 < gwolf> I agree. There are no voices against from my recollection in the list
19:10 <@marga> I think the next steps would be: 1) draft a statement 2) vote on the statement 3) close the issue
19:10 < ntyni> marga's message today pretty much matched my thoughts fwiw, so ack on consensus
19:10 < smcv> marga's mail to the bug says everything I would have said
19:10 <@marga> Thanks :)
19:10 < gwolf> there was one DD arguing against our probable outcome, but I understand someone (fil?) spoke with him at DC18 and basically come to an OK
19:10 <@marga> Thanks smcv for your thorough analysis, I really appreciated that.
19:11 <@marga> Maybe bremner?
19:11 < gwolf> maybe. Don't remember.
19:11 < smcv> hah for a moment I totally forgot that I'd mailed the bug and thought you were endorsing "I agree with marga" as a thorough analysis :-)
19:11 < gwolf> But anyway - I think there is consensus within ctte, and don't think the consensus will be contentious outside of our little group :)
19:12 <@marga> Alright, so, shall we add an action item for Tollef to do the writeup as he seems to have volunteered?
19:12 < ntyni> sure :)
19:12 < gwolf> I'm OK with volunteering Tollef :) In case he cannot or won't do it, I can take the task.
19:12 < smcv> sounds good
19:13 <@marga> #action Mithrandir to draft the resolution so that we can vote on it. gwolf can take it if that's not ok.
19:13 -!- marga changed the topic of #debian-ctte to: #904558 What should happen when maintscripts fail to restart a service
19:13 -!- Mithrandir has joined #debian-ctte
19:14 <@marga> \o/
19:14 < Mithrandir> Hey
19:14 < gwolf> This is way less clear cut IMO... And not having any follow-up kind-of-supports it
19:14 < gwolf> Mithrandir: !!
19:14 < Mithrandir> on phone, so not terribly useful
19:14 <@marga> Mithrandir, we just actioned you in the previous topic:
19:14 <@marga> #action Mithrandir to draft the resolution so that we can vote on it. gwolf can take it if that's not ok.
19:14 < gwolf> Mithrandir: (re:#904302)
19:14 <@marga> (this is the series.vendor issue)
19:14 < Mithrandir> yup
19:14 < Mithrandir> efm
19:14 < Mithrandir> wfm
19:14 <@marga> Awesome.
19:15 < gwolf> great, I'm off the hook ;-)
19:15 < Mithrandir> Does anybody have opinions? I think mine is clear
19:15 <@marga> For this topic, we had less traffic, but I agree with Tollef's mail from last week. i.e. We could recommend good behavior but we shouldn't dictate anything.
19:16 < smcv> I'm curious what you think the good behaviour is
19:16 < ntyni> Mithrandir: in case it's not obvious, we've moved onto #904558 already
19:16 < gwolf> I think Mithrandir's suggestion is clear - But we have to decide what the recommended one would be - Fail open? Fail closed?
19:16 < gwolf> (best is not to fail, of course)
19:16 <@marga> Oh, I do || true on basically all statements I add to a maintscript :)
19:17 <@marga> Failing maintscripts are really a horrible nightmare
19:17 < Mithrandir> ntyni: yup, got that
19:17 < gwolf> marga: the bug is particularly about restarting services, so I guess failing to mkdir or so does not have to follow this.
19:17 <@marga> I know, I'm just saying I might be a bit of an extremist in the "maintscripts shouldn't fail" camp
19:18 < Mithrandir> i kinda lean towards fail early, but balancing that is failing maintscripts tend to leave you at the bottom of a deep hole, so…
19:18 < gwolf> But... There are issues™ in whatever direction we decide to follow. i.e. failing to restart a daemon on a package update will leave the user running a potentially buggy version even having a fixed one installed...
19:18 < smcv> I wonder whether distinguishing between stop;start and "reload harder"/re-exec is useful
19:18 < smcv> otoh systemctl restart always means stop;start
19:19 < gwolf> Yes, failing maintscripts is a terrible headache for users. It often involves editing the maintscript in question, not what I expect just-about-anybody to be able to do
19:19 < Mithrandir> i think being consistent is more important than whether the default is stop or not
19:20 < gwolf> smcv: oh, right - Restart means stop/start. And if the daemon was not running to begin with (due to the user having manually stopped it, say), a restart will leave it running.
19:20 < Mithrandir> gwolf: we expect it to be: fix the daemon startup, dpkg --configure in this case
19:20 < smcv> if the maintscript fails I suppose the key question is what the user is going to do about it
19:20 < Mithrandir> so no maintscript hacking
19:21 < smcv> and whether what they do about it differs from what they'd do if it suppressed the failure and left the package configured-but-non-functional
19:21 < smcv> (with a big fat warning)
19:21 <@marga> One of the problems with failing maintscripts is that it's usually very hard to understand what's failing. Even if you don't need to edit the maintscript itself, you may need to go read it to understand where it failed.
19:21 <@marga> smcv, where would the warning be visible?
19:21 < gwolf> marga: yes, and that's something that cannot be expected from users in general. Not even knowing where the scripts live
19:21 < Mithrandir> Or just rerun dpkg to get rid of the chaff
19:22 < smcv> marga: insert recurring wish for useful logging here
19:22 <@marga> :)
19:22 < Mithrandir> so, I’m somewhat worried that ignoring failures leads to reboot, then a failed service
19:23 < gwolf> ...I "smell" that this might be too broad of an issue for us to rule
19:23 < smcv> and not ignoring failures leads to what?
19:23 <@marga> Uhm, I'm not sure I follow. It could also lead to "failed service -> reboot -> working service", depending on the failure
19:23 < smcv> a failed service, a failed dpkg and a confused sysadmin?
19:23 < Mithrandir> in the Debian spirit, can we have a default that is overridable and then we just have to choose the default.
19:24 <@marga> "Maintscripts should try as hard as possible not to fail"?
19:24 < Mithrandir> gwolf: we are asked to advise, not decide.
19:24 < smcv> Mithrandir: only if it doesn't require us to inject yet more shell script complexity into maintscripts
19:24 < gwolf> right
19:24 < gwolf> marga: unset -e
19:24 <@marga> Yeah
19:25 <@marga> "And if they do fail, they should output an actionable message to the user"
19:25 < gwolf> ...I would not like that, though...
19:25 < ntyni> it would be nice to have a generic way to 'fail gracefully' and inform the user that a daemon failed to start during configure
19:25 < gwolf> The question is not about the general initscripts flow, but about service restarts
19:26 < ntyni> I'm thinking of something like a dpkg trigger
19:26 < Mithrandir> Can we do something good here without going into detailed design?
19:26 <@marga> gwolf, I'm not sure it's a separate question, really
19:26 < smcv> perhaps relevant:
19:26 < smcv> why is it so important that we restart services?
19:26 < gwolf> marga: it stems/flows from the specific question, right... But we are asked to advise specifically on what happens in service restarts
19:26 < smcv> answer: if they have security vulns then the old version is bad
19:27 < smcv> but if libssl or libldap or libdbus has security vulns then we don't restart the world
19:27 < Mithrandir> smcv: config updates, ensure new version works
19:27 < gwolf> smcv: Also, if there are behavioural changes (i.e. upstream package update), you want the running code to match what you have in disk
19:27 <@marga> The problem is that "the restart operation fails" already has two possible options of failure: stop failed or start failed.
19:28 < Mithrandir> marga: stop failed happens a lot less frequently with systemd though
19:28 <@marga> Yeah, I guess systemd will take care of making it happen one way or the other
19:28 < smcv> service manager in "quite good at stopping services" shock
19:28 < Mithrandir> I need to go again, I’d like to finish this on email, but please do continue the discussion:-)
19:29 < Mithrandir> feel free to highlight me if there is something in particular I can be useful on
19:30 <@marga> So, let's assume that the service did a restart, the stop succeeded but the start failed... What's to gain from the maintscript failing?
19:30 < smcv> so the wider context here is that the submitter of #780403 agrees with marga that things should fail less hard
19:31 < gwolf> marga: right, it's IMO always better to gracefully finish the install than to have a failed maintscript
19:31 < smcv> #780403 is actually about start, not restart, btw
19:31 < gwolf> Even though that will annoy the sysadmin as it results in a dead service
19:32 <@marga> Yeah, it's not that much different anyway
19:32 < smcv> daniel pocock makes an interesting point in the merged bug 802501 (which is about restarts) that if a service is taking down the daemon to do some offline reconfiguration,
19:33 < smcv> it can stop it in the preinst, which (unlike the postinst) can abort installation without leaving the system in an undefined state
19:34 < smcv> I feel as though that's a really rare case though
19:34 <@marga> Yeah, preinst failing is a different story than postinst failing
19:35 < gwolf> and preinst should be quite more limited in scope than postinst fwiw
19:36 <@marga> And really, I personally see no gain in postinst failing. Is there any gain at all?
19:36 < smcv> stopping in preinst means you have to communicate to the postinst that it's ok to start the thing, though, if you want the overall effect to be like systemctl try-restart (which is "stop, then start if it was previously running")
19:38 < smcv> obligatory controversial opinion: maybe we should be more prepared to require a reboot, and less keen to do surgery while the patient is still awake
19:38 < ntyni> marga: the gain is making sure that the admin notices the daemon failure?
19:38 <@marga> :-/
19:38 <@marga> I guess that shows how bad things are regarding surfacing problems
19:39 < smcv> ntyni: "surprise! your packages are in an undefined state" is not necessarily such a constructive way to signal that?
19:39 < ntyni> it sure isn't
19:39 < gwolf> ntyni: I agree with marga's and smcv's feeling. Of course it grabs the admin's attention. But not for the better!
19:41 < ntyni> I'm not saying it's good practice, just saying that's the only gain I see
19:41 < smcv> sure
19:41 < gwolf> right... But I think we could then unanimously say it's bad to stop in a state the package manager is confused..?
19:42 < gwolf> It's better that the admin notices when foobard is not answering in its usual port...
19:42 < smcv> shouting about it on stderr and in syslog/Journal is always good of course
19:43 <@marga> The more I think about this, the more I think this is a remnant of old times. We need a better way to communicate to the user that something is not right
19:43 < smcv> systemd seems quite good at making a lot of noise when it can't do what you asked it to
19:43 < ntyni> marga: agreed
19:43 < gwolf> marga: "shouting" is not necessary, but "signaling our init so that it complains" (i.e. "degraded") should be enough
19:44 < smcv> if we assume systemd for a moment
19:44 < smcv> is there a way that daemons *can* fail to restart without it logging that fact?
19:44 <@marga> Not that I'm aware
19:44 < gwolf> Right. I'm talking about my recent experiences :-] But... Logging, stdout, failed return codes upon invocation...
19:44 < smcv> I think it'll always log "systemd[1]: Failed to start whatever."
19:44 < ntyni> maybe we should have an apt hook to run 'systemctl status' after upgrades
19:45 < gwolf> whatever init system you choose, those are the main communication methods
19:45 < ntyni> or something like that
19:45 * gwolf hopes ntyni is joking
19:45 <@marga> It wouldn't help on unattended upgrades, which I expect is the majority of upgrades nowadays
19:45 < ntyni> sure
19:46 < smcv> if your upgrades are unattended then your error reporting also needs to be unattended
19:46 < smcv> logcheck exists
19:46 < smcv> so do nagios and friends
19:46 <@marga> Yup
19:47 < smcv> I'm not sure that fixating on "but what about restarts" is proportionate
19:47 < smcv> daemons can crash any time
19:48 <@marga> This discussion is getting longer than I had originally expected. It seems to me that we are mostly in agreement that maintscripts failing are generally undesirable, if maybe not in 100% agreement of how undesirable they are... How should we move forward?
19:48 < smcv> it seems like there would be consensus for a statement with some weasel words in it, at least
19:49 * gwolf asks weasel for his words
19:49 < gwolf> marga: I think we have to keep in mind spwhitton's request when he opened this bug...
19:49 < smcv> a service failing to restart should be logged prominently in the system log and the maintainer script's stderr, but should not usually[1] cause the maintainer script to fail, unless there is a really good[2] reason why it must
19:50 <@marga> are 1 and 2 the weasel words?
19:50 < smcv> (this footnote intentionally left blank)
19:50 <@marga> heh
19:50 < gwolf> ...oh - never mind my last sentence - he asks us to _decide_, not to _advise_
19:50 <@marga> Ok, I'll take the action item of writing up something, and send it to the bug.
19:51 < ntyni> he seems to be asking for advice on whether we should decide
19:51 < ntyni> :)
19:51 <@marga> #action Marga to write up a summary of the discussion here and send it to the bug. Discussion to continue there.
19:51 < gwolf> smcv: I'd go a bit more general than what you suggest - "A service failing to restart should signal the administrator in a prominent but nonintrusive way".?
19:51 < gwolf> (i.e. we don't do design work)
19:51 < smcv> sure, what I have in mind is "what we do now? do that" :-)
19:52 < gwolf> The important part, where we all agree, is that we want dpkg to fail the least possible, and leaving a broken maintscript does not help the user.
19:52 < smcv> yeah
19:52 < ntyni> yes
19:53 < smcv> broken pre*: well if you absolutely must (clue: if you need to ask, you don't)
19:53 < smcv> broken post*: just say no
19:53 < gwolf> :)
19:53 <@marga> :)
19:53 <@marga> Cool, let's move to our third topic
19:53 -!- marga changed the topic of #debian-ctte to: Any interesting things to share from DC18?
19:53 < gwolf> Well, I think it is up to me :)
19:53 <@marga> Yup
19:53 < gwolf> We had our annual ctte bof, which went smoothly but lacked a bit IMO
19:54 < gwolf> (being me the presenter, and presenting slides written by OdyX originally)
19:54 < gwolf> ...There were conversations mainly regarding our two bugs that several of us had, mainly with the submitters, but also with some other interested people
19:54 < gwolf> but all in all, I don't think there's much to report
19:55 < gwolf> ...any questions? :-]
19:55 < ntyni> thank you for handling the bof
19:56 < gwolf> Did any of you follow it?
19:56 < gwolf> Or any IRC conversation that happened during it?
19:56 <@marga> Thanks, I guess that's what I wanted to know. I haven't actually watched any talks from DebConf
19:57 < ntyni> I did follow it
19:57 < ntyni> the discussion wasn't very lively
19:57 <@marga> :(
19:58 < ntyni> but it was certainly good to have it I think
19:58 < gwolf> nope. Now again, we are at a point where the ctte has been dormant-ish for three months
19:58 < gwolf> I was mostly interested in getting other people to become interested in joining
19:58 < gwolf> but don't think I managed to raise too much enthusiasm
19:58 <@marga> Yeah, that will be a topic for next month's meeting
19:58 <@marga> (we said we would pause recruiting until Sept)
19:59 < gwolf> right
19:59 <@marga> Anyway, we are almost at the hour...
19:59 -!- marga changed the topic of #debian-ctte to: Any additional business?
19:59 < smcv> would it be helpful to have more emphasis on "ask the ctte for advice"?
20:00 < smcv> there are a couple of gnome bugs where my thought is "what would I even do about this"
20:00 < smcv> maybe formally asking the ctte about them would set a good example?
20:01 < gwolf> smcv: could be. We are not drowning in work, as you see ☺ Then again, not every weird issue should be brought to the ctte. Hopefully...
20:02 <@marga> Well, if it's just asking for advice is different than asking for a ruling in a dispute
20:02 < smcv> I'd only consider that for >= important bugs tbh
20:02 <@marga> We've had a bunch of bad experiences last year with disputes going off-track. We haven't had much advice asking until these two bugs from Sean
20:03 < smcv> yeah that was partly why I thought it might set a good example
20:03 < gwolf> marga: the bad experiences were... modem-manager? and..?
20:03 <@marga> gwolf, uhm... someone else who also orphaned their package rather than engage with the TC...
20:04 < smcv> instead of reserving appeals to the ctte for "I want to overrule this maintainer", try to encourage people to come to the ctte with "I'm considering this workaround for a broken situation but I don't know if I should"
20:04 < gwolf> right. Well, FWIW (and on-the-record, as we are still formally in meeting), I was saddened and surprised to read Guillem's answer to me asking for his stance on #904302
20:05 <@marga> Me too and I was going to bring this as a subject to discuss today if we had time, but we are overtime now, so I think I'd rather table it for next month.
20:05 <@marga> smcv, I think it would be a nice experiment
20:06 < ntyni> Guillem has made his opinion about the ctte clear on previous occasions too
20:06 < smcv> ok, will look at summarizing #896019 and/or #888549 into a form I can ask for advice on
20:07 <@marga> Alright, I think that would be endmeeting, unless someone has some urgent matter to bring up?
20:07 * gwolf shuts up
20:07 < ntyni> nothing from me